My recommendation would have been multiple rigged models, one skeleton per rig, all in the same project so you can see them together. I think this makes for easier rigging and a more coherent data structure on the game end. It does involve some (not a lot of) extra code on the game end to manage which skeleton (or "model") to use and which animation to play. Whoever instructed you to keep it in one skeleton was presumably interested in not having to code that extra logic, but that's going to severely slow down the pipeline, 'cause the logic only has to be coded once, while you'd have to wrestle with the inconveniences for every character you need to animate.
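To give a rough idea of how small that game-side logic can be, here's a minimal sketch in Python. It is not tied to any Spine runtime: `Rig`, `Character`, and the skeleton and animation names are all made up for illustration, and `Rig` just stands in for whatever skeleton object your runtime actually gives you.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class Rig:
    """Hypothetical stand-in for one loaded skeleton (one per facing)."""
    name: str
    current_animation: Optional[str] = None

    def play(self, animation: str) -> None:
        # A real runtime would set the animation on the skeleton's state here.
        self.current_animation = animation


@dataclass
class Character:
    """Holds one rig per facing and routes animation requests to the active one."""
    rigs: Dict[str, Rig]
    facing: str = "front"

    def play(self, action: str, facing: Optional[str] = None) -> Rig:
        # Switch rigs only when a new facing is requested.
        if facing is not None:
            if facing not in self.rigs:
                raise KeyError(f"no rig for facing {facing!r}")
            self.facing = facing
        rig = self.rigs[self.facing]
        rig.play(action)
        return rig


hero = Character(rigs={"front": Rig("hero_front"), "side": Rig("hero_side")})
active = hero.play("walk", facing="side")
print(active.name, active.current_animation)
```

The point is just that the "which skeleton, which animation" decision lives in one place, so every new character only needs its own entries in `rigs`.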
But anyway... that aside.
Two rigged models in one skeleton is a nightmare not just for the modeler/animator but for game performance too, 'cause you'd have extra bones not contributing to the visible character but having to be updated anyway. And for the animator, you'd have to manage showing and hiding images.
One skeleton that needs to swap images presents the same problem of having to manage showing and hiding images, but it also means doing guesswork for the setup pose (the rigging process). If, for example, your setup pose is front facing, how are you going to align your parts and pivots when they're sideways? I suppose it also depends on whether the art style is forgiving in terms of alignment.
The transitioning problem is a separate problem. You certainly can't count on plain mixing/crossfading to make transitions for you when you need to turn your character, or when the character needs to drastically change the bone arrangement to look a certain way. All mixing does is interpolate the bones naively, as if they were abstract points on a plane with rotations and scales, because that's all they really are under the hood.
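Here's a minimal sketch of what that kind of naive mixing amounts to, in Python. This is an illustration of the idea, not the actual Spine runtime code: `BonePose`, `mix`, and the numbers are made up, and the rotation is blended with a plain linear interpolation for simplicity.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BonePose:
    """Hypothetical bone transform: a point on a plane plus rotation and scale."""
    x: float
    y: float
    rotation: float  # degrees
    scale: float


def lerp(a: float, b: float, t: float) -> float:
    """Linear interpolation between a and b for t in [0, 1]."""
    return a + (b - a) * t


def mix(a: BonePose, b: BonePose, t: float) -> BonePose:
    """Blend two poses channel by channel, with no idea what the bone represents."""
    return BonePose(
        x=lerp(a.x, b.x, t),
        y=lerp(a.y, b.y, t),
        rotation=lerp(a.rotation, b.rotation, t),
        scale=lerp(a.scale, b.scale, t),
    )


# Made-up arm bone: off to one side when front facing, near the centre when sideways.
front_arm = BonePose(x=-20.0, y=50.0, rotation=0.0, scale=1.0)
side_arm = BonePose(x=4.0, y=50.0, rotation=90.0, scale=1.0)

halfway = mix(front_arm, side_arm, 0.5)
print(halfway)
```

Halfway through the mix the arm is simply sitting at the midpoint of the two poses. Nothing in there knows the character is supposed to be turning, so the parts just slide and rotate toward their new places.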
PS
Throughout this process, if you're using Spine 3.0, I highly recommend using the handy new hotkey CTRL + SHIFT + L.
This adds keys to the dopesheet to save the current pose, so you can freeze that pose, or copy and paste it to another animation or time.
It's called "Key Dopesheet" in the hotkeys file, but you can also think of it as "Key Current Pose".
This will be handy for copying or moving around which images are visible or hidden from any given point in your animation.