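My formula input(0)*h(0)+input(1)*h(1)+input(2)*h(2)+input(3)*h(3) was incorrect: the input samples were in the wrong order.
It has to be input(3)*h(0)+input(2)*h(1)+input(1)*h(2)+input(0)*h(3), which is the usual convolution sum with the newest sample multiplied by h(0). In the commutator structure this means the filters have to be applied in reverse order.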
The other mistake I made was that the downsampler was taking the first sample of each group of four, whereas to match the formula I just gave it should take the last sample of the four. This can easily be done in this Simulink block, but selecting a sample other than the first adds latency, and the block outputs a zero first. To compensate for that, I added a delay element after the commutator decimator.
And now both implementations give the same result. Problem solved!
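The equivalence can also be checked outside Simulink. Below is a minimal Python sketch, not the Simulink model itself: the 12-tap random filter, the random input, and the helper name `fir` are assumptions made up for illustration. It compares "filter at full rate, then keep the last sample of every four" with a commutator-style polyphase decimator in which branch p uses the sub-filter h(p), h(p+4), h(p+8), ... and is fed the samples in reverse order within each group of four.

```python
import random

M = 4  # decimation factor


def fir(x, h):
    """Causal FIR: y[n] = sum_k h[k] * x[n-k], with x[n] = 0 for n < 0."""
    return [sum(h[k] * x[n - k] for k in range(len(h)) if n - k >= 0)
            for n in range(len(x))]


random.seed(0)
h = [random.uniform(-1, 1) for _ in range(12)]  # hypothetical filter taps
x = [random.uniform(-1, 1) for _ in range(64)]  # hypothetical input signal

# Reference: filter at the full rate, then keep the LAST sample of each
# group of M, i.e. y_full[M*m + M-1].
y_full = fir(x, h)
y_ref = y_full[M - 1::M]

# Polyphase version: branch p filters the stream x[M*m + (M-1-p)] with the
# sub-filter h[p], h[p+M], h[p+2M], ... and the branch outputs are summed.
y_poly = [0.0] * (len(x) // M)
for p in range(M):
    e_p = h[p::M]           # polyphase component p of the filter
    u_p = x[M - 1 - p::M]   # commutator stream, reversed within each group
    for m, v in enumerate(fir(u_p, e_p)):
        y_poly[m] += v

# Largest difference between the two implementations (rounding error only).
max_err = max(abs(a - b) for a, b in zip(y_ref, y_poly))
print(max_err)
```

With a 4-tap filter each branch reduces to a single multiplication, which gives exactly input(3)*h(0)+input(2)*h(1)+input(1)*h(2)+input(0)*h(3). The sketch does not model the extra latency or the initial zero of the Simulink downsampler; it only checks the sample ordering.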
There is an interesting discussion of this issue at https://www.dsprelated.com/thread/8719/polyphase-decimation-filter-simulink-implementation-of-commutator-variant-not-as-expected