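When IP-Adapter FaceID is initialized, it replaces the attention
processors on the pipeline's UNet cross-attention layers with ones that
expect face tokens alongside the text embeddings. Calling the raw
pipeline() without face embeddings leaves those layers without the
input they expect, which produces corrupted output.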
Solution: when IP-Adapter is loaded but no face_image is provided, call
ip_model.generate() with s_scale=0.0 and zero embeddings. This disables
face conditioning while still giving the modified layers the inputs
they expect.
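
The routing described above could be sketched roughly as follows. This
is a minimal illustration, not the actual change: generate_image and
its zero_embeds parameter are hypothetical names, and the faceid_embeds
keyword passed to ip_model.generate() is an assumption about the
adapter's signature. Only s_scale=0.0 and the use of zero embeddings
come from the description.

```python
from typing import Any, Optional


def generate_image(
    pipe: Any,
    ip_model: Optional[Any],
    prompt: str,
    face_image: Any = None,
    faceid_embeds: Any = None,
    zero_embeds: Any = None,
    **kwargs: Any,
) -> Any:
    """Hypothetical dispatcher: never call the raw pipeline once the UNet is patched."""
    if ip_model is None:
        # No adapter loaded, so the UNet is unmodified and pipeline() is safe.
        return pipe(prompt=prompt, **kwargs)

    if face_image is not None:
        # Normal FaceID path: forward the real face inputs unchanged.
        return ip_model.generate(
            prompt=prompt,
            face_image=face_image,
            faceid_embeds=faceid_embeds,
            **kwargs,
        )

    # Adapter loaded but no face supplied: go through generate() anyway,
    # with zero embeddings and s_scale forced to 0.0.
    if zero_embeds is None:
        raise ValueError("zero_embeds is required when no face_image is given")
    # Override any caller-supplied s_scale instead of passing it twice.
    fallback_kwargs = {**kwargs, "s_scale": 0.0}
    # ASSUMPTION: generate() accepts a `faceid_embeds` keyword.
    return ip_model.generate(
        prompt=prompt, faceid_embeds=zero_embeds, **fallback_kwargs
    )
```

Here zero_embeds would be a tensor shaped like a real face embedding,
for example torch.zeros(1, 512) if the adapter consumes 512-dimensional
embeddings (an assumption, not stated above). Some adapter variants may
also need a placeholder face image on the fallback path; that is not
covered by this sketch.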