* Fix for running via DirectML Fix DirectML empty image generation issue with Flux1. add CPU fallback for unsupported path. Verified the model works on AMD GPUs * fix formating * update casual mask calculation