
ReshapeLayer and FlattenLayer should not allow in-place computation #3393

Closed

Conversation

@kkhoot (Contributor) commented Nov 27, 2015

This PR is to prevent incorrect gradients caused by in-place reshaping of Blobs in ReshapeLayer and FlattenLayer.

If the bottom and top blobs are the same in either of these layers, and the layer sits on top of a layer that uses the shape of its top blob during backward (such as SliceLayer or PoolingLayer), then that preceding layer's gradients can be computed incorrectly. In-place use also allows bottom and top blobs with different counts, since the count checks are currently performed only after the blob has been reshaped.

A simple solution is to disallow in-place operation for these layers. Another option would be to save the bottom shape and restore it during backward.

This PR implements the former, simpler solution.
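
Concretely, the change boils down to checking that the bottom and top are distinct Blob objects. A minimal sketch of that kind of guard (the exact placement and error message here are illustrative, not necessarily identical to the diff):

```cpp
// Sketch of the guard for ReshapeLayer; FlattenLayer gets the same check.
// CHECK_NE comes from glog, which Caffe already uses for runtime assertions.
// Comparing the Blob pointers detects in-place use, i.e. bottom[0] and top[0]
// referring to the same Blob.
template <typename Dtype>
void ReshapeLayer<Dtype>::Reshape(const vector<Blob<Dtype>*>& bottom,
    const vector<Blob<Dtype>*>& top) {
  CHECK_NE(top[0], bottom[0]) << this->type()
      << " Layer does not allow in-place computation.";
  // ... existing reshape logic unchanged ...
}
```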

@ronghanghu (Member)

So far, only neuron layers are supposed to be applied in-place; I am not sure whether anyone relies on in-place reshape.

@seanbell

Perhaps this should be generalized to throw an error for all layers that aren't neuron layers?

@ronghanghu (Member)

> Perhaps this should be generalized to throw an error for all layers that aren't neuron layers?

Yes, I believe that would be the correct guideline for in-place computation.

@kkhoot (Contributor, Author) commented Nov 30, 2015

Unlike other common layers, these reshaping layers seemed to work in-place since they do not change the input values, and I had been using them that way for months. I only found out during debugging that it does not work. I am not sure how common this case is, but IMO it should at least be prevented or documented.

@seanbell

Generally, I think in-place computation should be disallowed except in layers whose author has explicitly indicated that it is safe. I think this needs to be enforced with an explicit CHECK statement.

There are also considerations involving the previous layer. For example, if a layer needs its top blob in the backward pass, then the next layer cannot run in-place. This is also not currently checked.
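
For illustration, a generalized guard during net construction could look roughly like the sketch below. This is purely hypothetical code, not something that exists in Caffe: the helper name and the `supports_in_place` flag are assumptions standing in for a per-layer "in-place is safe" declaration.

```cpp
#include <string>
#include <vector>

#include "caffe/blob.hpp"
#include "glog/logging.h"

// Hypothetical helper (not actual Caffe code): unless a layer has explicitly
// declared in-place computation safe, fail if any of its top blobs aliases
// one of its bottom blobs. The supports_in_place flag is an assumed per-layer
// attribute that Caffe does not have at the time of this PR.
template <typename Dtype>
void CheckInPlaceAllowed(const std::string& layer_type,
                         bool supports_in_place,
                         const std::vector<caffe::Blob<Dtype>*>& bottom,
                         const std::vector<caffe::Blob<Dtype>*>& top) {
  if (supports_in_place) { return; }
  for (size_t i = 0; i < bottom.size(); ++i) {
    for (size_t j = 0; j < top.size(); ++j) {
      CHECK_NE(bottom[i], top[j])
          << layer_type << " Layer does not allow in-place computation.";
    }
  }
}
```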

@jeffdonahue (Contributor)

Thanks for the PR @kkhoot. I think these checks should be moved to the LayerSetUp method of each class, but otherwise this LGTM. I agree with @seanbell that this should be generalized and in-place computation should be disallowed by default, but having these two checks is still better than the status quo.

I started trying to clean up the in-place behavior a while ago but never really completed it. I was trying to have Net infer whether computation could be done in-place in each layer via a bunch of extra layer attributes -- see the modified vision_layers.hpp if interested (from back when all layer headers were in vision_layers.hpp). I think it mostly worked, but I never got around to polishing and testing it. The code I linked could be helpful if someone wants to steal the bools for each layer (though quite a few more layers have been added since) -- I think the conjunction !ForwardReusesBottomData && !BackwardReusesTopDiff for each layer would be the check you'd want to verify that in-place is OK.
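
The attribute-based scheme described above might look roughly like the following sketch (the two method names are taken from the comment; the conservative defaults and the helper function are assumptions, not the code from the linked branch):

```cpp
// Sketch of the per-layer attributes described above. Conservative defaults:
// assume a layer does reuse its bottom data and top diff, so in-place is
// disallowed unless a layer explicitly overrides these to return false.
class LayerBase {
 public:
  virtual ~LayerBase() {}
  virtual bool ForwardReusesBottomData() const { return true; }
  virtual bool BackwardReusesTopDiff() const { return true; }
};

// Hypothetical check during net construction: allow in-place computation only
// when the layer reuses neither its bottom data nor its top diff.
bool InPlaceOk(const LayerBase& layer) {
  return !layer.ForwardReusesBottomData() && !layer.BackwardReusesTopDiff();
}
```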

@kkhoot (Contributor, Author) commented Dec 7, 2015

Thanks for the comments. I also agree with @seanbell's idea. I hope @jeffdonahue's nice work gets merged someday.

jeffdonahue added a commit that referenced this pull request Jan 26, 2016
@jeffdonahue (Contributor)

I went ahead and merged this in 940b299 (with a couple of spacing fixes), as it's useful to have even without fancier/default checks. These checks should be removed if/when such general checks are implemented. (And I realized @kkhoot put the check in FlattenLayer::Reshape rather than LayerSetUp because FlattenLayer currently has no LayerSetUp.)

@kkhoot deleted the fix_reshape branch on February 10, 2016 at 05:43