-
Notifications
You must be signed in to change notification settings - Fork 65
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add OpenCL support #32
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like the idea
I think that the options now are a bit confusing TBH. For example, we have a |
I guess this is fine but in the case where opencl isn't built in, should we ifdef the option out? In the case where you'd use --force-yuv, I'm guessing you'd probably want to use opencl if it is built in, so you'd need to pass both options. You'd only need to disable opencl to test performance gains or if there's a problem. I would like to have opencl enabled by default if it was built in and the other option could be called whatever but I would like --to-yuv. |
I made it like you suggest to see how it works out. I meant |
} | ||
#else | ||
/* Silence compiler warning when opencl is disabled */ | ||
(void)(y_invert); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we can move this inside the #ifdef
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The point of this is to not leave y_invert variable declared but not used in the case where opencl is disabled. If we put it inside the ifdef, the warning still happens.
src/main.cpp
Outdated
{ 0, 0, NULL, 0 } | ||
}; | ||
|
||
int c, i; | ||
std::string param; | ||
size_t pos; | ||
while((c = getopt_long(argc, argv, "o:f:g:c:p:d:la::t::", opts, &i)) != -1) | ||
while((c = getopt_long(argc, argv, "o:f:g:c:p:d:la::t::e::", opts, &i)) != -1) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why does e
have optional parameter when it doesn't take any value?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed.
I'll test this later today and hopefully merge, if there is nothing else that you have planned to do. |
I don't have anything else planned at this time. |
@ammen99 Can you test this patch (on top of opencl branch)? Adjust the value of SIZE_MULTIPLIER and see if there is a relation to how much cpu is being used. I've tried between 1-32, with and without vaapi using libx264 and h264_vaapi. Here the results are that the higher the number (meaning the more memory allocated and copied), the less cpu that is being consumed (via htop). It seems counterintuitive that copying more data would result in less cpu usage. Is it the case for you too? Any idea what might be going on here? |
As discussed in irc, it has been realized that the cpu was probably doing less because the gpu was doing more copying, essentially the same as calling usleep after copying. Now that I've set the buffers to the correct size, I've tested on a box with libx264 and h264_vaapi, with and without opencl, while idle and while moving mouse cursor, recording the top two (highest cpu usage) threads and the results are always better in the opencl case except libx264+sws_scale when idle. It also should now clean up properly without crashing.
|
So this should be ready for merging, right? Could you please rebase on top of master? |
OpenCL will be used for converting from rgb to yuv data when built with OpenCL support and using vaapi with to-yuv option.
Namely this avoids reallocating the output buffer each frame.
This makes it so OpenCL is used for rgb to yuv conversion instead of sws_scale(), when --to-yuv option is used.
And other stuff.
…ption The new yuv420p shader does pixel averaging to produce uv values from 2x2 subsample. It also supports odd size resolutions in the vaapi case. The --no-opencl option (or -n) disables opencl if it was built in.
Yes, should be ready to go now. |
I tested this final version with both OpenCL on and off, and both VAAPI on and off, seems to work fine. Thanks for the work on this, and sorry for the delay. |
No problem, thanks for testing. I'd like to see this 'fixed' in the gallium vaapi driver and test the internal converter for comparison, but maybe that will come in the future. |
This attempts to use OpenCL to convert rgb data to yuv instead of sws_scale() when wf-recorder is built with OpenCL support and the --to-yuv or -t option is passed. This helps offload work to the gpu that would otherwise be done on the cpu.