Today's systems, from smartphones to workstations, are becoming increasingly parallel and heterogeneous: processors contain a growing number of identical cores, and systems commonly pair a discrete general-purpose GPU with the CPU or even integrate both on a single chip. To benefit from this trend, software should utilize all available resources and adapt to varying configurations, such as differing CPU and GPU performance or competing processes. This paper investigates parallelization and adaptation strategies using dense stereo vision as an example application, which forms a basis for advanced driver assistance systems, robotics, and gesture recognition, among other fields, and is representative of a broad range of similar computer vision methods. For this problem, task-driven as well as data element- and data flow-driven parallelization approaches are feasible. To achieve real-time performance, we first exploit data element-parallelism individually on each device. On this basis, we develop and implement strategies for cooperation between heterogeneous processing units and for automatic adaptation to the hardware available at run time. Each approach is described with regard to, among other aspects, how data is propagated to the processors and how it relates to established methods. An experimental evaluation on multiple test systems reveals the advantages and limitations of each strategy.
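The data element-parallelism mentioned above can be illustrated with a minimal, hypothetical sketch (not the paper's actual implementation): in sum-of-absolute-differences (SAD) block matching, the matching cost at each candidate disparity is computed independently for every pixel, and it is exactly this per-pixel independence that maps onto GPU threads or CPU SIMD lanes. All function names and parameters below are assumptions for illustration.

```python
import numpy as np

def sad_block_matching(left, right, max_disp, radius=1):
    """Dense stereo via SAD block matching (illustrative sketch only).

    For each candidate disparity d, the cost is evaluated for every
    pixel at once; this per-pixel independence is the data
    element-parallel structure referred to in the abstract.
    """
    h, w = left.shape
    costs = np.full((max_disp + 1, h, w), np.inf, dtype=np.float32)
    k = 2 * radius + 1
    for d in range(max_disp + 1):
        # per-pixel absolute difference between left[x] and right[x - d]
        diff = np.abs(left[:, d:].astype(np.float32)
                      - right[:, :w - d].astype(np.float32))
        # aggregate costs over a (2r+1) x (2r+1) window (naive box filter)
        padded = np.pad(diff, radius, mode='edge')
        agg = np.zeros_like(diff)
        for dy in range(k):
            for dx in range(k):
                agg += padded[dy:dy + diff.shape[0], dx:dx + diff.shape[1]]
        costs[d, :, d:] = agg
    # winner-takes-all: pick the disparity with the minimal aggregated cost
    return np.argmin(costs, axis=0)
```

Because every pixel's cost and its winner-takes-all minimum are independent of all other pixels, the two inner aggregation loops and the vectorized NumPy operations could equally be expressed as one GPU kernel launch per image, which is the mapping such methods typically use in practice.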