Black Forest Labs, the studio behind the Flux family of AI image generators, announced last week the release of Flux 1.1 [Pro]. This comes just two months after the release of its original family of models, including Flux 1 Pro (a closed source model with industry-leading capabilities), Flux 1 Dev (a noncommercial, open source model) and Flux Schnell (a fully open source model).
The Flux models marked a major leap in generative AI technology with their text generation capabilities, prompt adherence and overall image quality. Even the smaller models, Flux Dev and Flux Schnell, generated results on par with generations from MidJourney, and way better than the outputs provided by SD3, Stability’s much anticipated evolution over SDXL, which turned out to be somewhat underwhelming.
The new model has already made a mark, securing the top Elo score in the Artificial Analysis image arena, a leading benchmarking platform for AI models. It has outperformed every other text-to-image model on the market while being almost as fast as Black Forest Labs’ smallest model.
The graph below shows the Elo score (image quality) on the Y axis and the generation speeds on the X axis. MidJourney enthusiasts may notice that their model is not represented—it’s so slow it’s literally off the chart. However, its Elo Score sits somewhere around 1100 points, just below Ideogram V2.
The new model also stands out on price: Flux 1.1 Pro costs $0.04 per image, lower than many other models on the market, including the original Flux 1 Pro. This pricing makes it a strong competitor to other paid services like MidJourney and Ideogram, which cost $96 and $84 a year, respectively. The MidJourney and Ideogram options are also slower and offer a higher cost per image.
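To put those numbers side by side, here is a rough break-even sketch: how many images a year the quoted subscription prices would buy at $0.04 per image. The function and constant names are ours, and the calculation assumes each subscription is a flat yearly fee. Real plans come with usage limits and tiers that are not modeled here.

```python
# Prices are kept in integer cents to avoid floating-point rounding surprises.
PER_IMAGE_CENTS = 4  # $0.04 per Flux 1.1 Pro image, as quoted in the article

# Annual subscription prices quoted in the article, in cents.
ANNUAL_PLANS_CENTS = {"MidJourney": 9600, "Ideogram": 8400}


def break_even_images(annual_cents: int, per_image_cents: int = PER_IMAGE_CENTS) -> int:
    """Images per year that the annual price would buy at the per-image rate."""
    return annual_cents // per_image_cents


def pay_per_image_cost_usd(images: int, per_image_cents: int = PER_IMAGE_CENTS) -> float:
    """Total pay-per-image cost in dollars for a given number of images."""
    return images * per_image_cents / 100


if __name__ == "__main__":
    for name, cents in ANNUAL_PLANS_CENTS.items():
        yearly = break_even_images(cents)
        print(f"{name}: {yearly} images per year, about {yearly // 12} per month")
```

Under these assumptions, someone generating fewer than roughly 200 images a month would spend less paying per image than on the $96 annual plan.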
Sadly, Flux 1.1 Pro cannot be run locally. Unlike its less powerful open source counterparts, such as the FLUX1 [Dev] and FLUX1 [Schnell] models, this new pro version is a closed source model, so users can access it only through platforms like Together AI, Replicate, Fal AI, and Freepik. It cannot be fine-tuned or personalized.
For those considering trying the model, some of these platforms offer a few credits for free generations, but once those are depleted, the best service according to our own criteria is Freepik. That’s because its Mystic workflow dramatically enhances generations with higher details and better aesthetics.
IT’S FINALLY HERE!
🔥 Freepik Mystic 🔥
“Any sufficiently advanced technology is indistinguishable from magic.” — Arthur C. Clarke ✨ Mystic is the most advanced AI generator to date with outputs directly in Full HD.
But what’s really Mystic? Let’s dive in 🧵👇 pic.twitter.com/nrlPTi0OWo
— Javi Lopez ⛩️ (@javilopen) August 27, 2024
There are no announcements regarding an open source 1.1 version of the FLUX1 [Dev] or FLUX1 [Schnell] models, but it’s clear that Black Forest Labs is focusing its efforts on great models for image and video creators.
Hands-On Testing and Review
We tried the new Flux model and the results were satisfactory. It is not a generational leap, like the move from SDXL to Flux, but it is certainly a welcome upgrade.
It is overall very realistic, has great text generation capabilities and is very creative in artistic tasks and styles. It is a good, versatile model that provides fast generations without compromising quality.
Realism
Prompt: “Polaroid photo with VSCO filter, 1990, woman, night, flash photo, blonde, young face, beautiful shadows, tropical plants, inside an apartment, DSLR, camera flash, holding a handwritten sign on a notebook saying ‘Verification for Decrypt October 7, 2024.’ The woman is doing the peace sign with her other hand.”
The model excels at producing realistic images, improving upon the airbrushed look of the initial Flux models. While not perfect, the results are highly convincing, especially with proper prompting. At first glance, these images—both generated with Flux 1.1 Pro—could pass for real without nitpicking small details.
The lettering is consistent with the prompt, and hand rendering has improved, though it’s not quite perfect. It’s important to note that these are not hand-picked samples but the first two generations. When working with generative AI, the best results typically come after several generations and edits.
The lighting is consistent with a camera flash, focusing on the subject without illuminating the whole room. The VSCO filter enhances realism, and prompt adherence is excellent.
Comparing Flux 1.1 to Flux 1 shows that the generations are quite similar in terms of realism at first glance. However, using the same prompt, the new model produces a more natural pose and a more consistent body. For example, Flux 1 generated what could appear to be an additional leg, which Flux 1.1 avoided. This improvement has more to do with accuracy than overall realistic aesthetic.
Prompt Adherence
Prompt: “A white cat playing the piano, wearing sunglasses and a hat, wearing purple Hawaiian style, full body shot against a gray studio background with lighting elements and a pterodactyl hanging from the ceiling, commercial video screengrab. The wall has the text ‘Emerge by Decrypt’”
Flux 1.1 takes prompt adherence a step further compared to Flux 1 Pro, successfully incorporating more elements into the scene without missing the mark. Our first generation with Flux 1 didn’t include the lighting elements or the pterodactyl. Additionally, the new generation is more realistic and feels more natural.
Spatial Awareness
Prompt: “A dog standing on top of a TV showing the word ‘Decrypt’ on the screen. On the left there is a woman in a business suit holding a coin, on the right there is a robot standing on top of a first aid box. The overall scenery is surreal.”
In terms of spatial awareness, Flux 1.1 and Flux 1 are comparable. Both generated all elements without issues. However, Flux 1.1 Pro seems superior when considering additional details. For instance, there’s less prompt spilling (when the model takes elements from the prompt and uses them in other areas). In the Flux 1.1 generation, the woman holds one coin with no visible additional coins, while Flux 1 generated a stash of coins next to the dog. Moreover, the error with the additional hand in Flux 1 Pro isn’t present in the newer model, and the surreal style is better represented in the Flux 1.1 generation.
Conclusion
Flux 1.1 Pro is overall more consistent and logical in its generations. If you can’t run a local model, it’s a very good competitor. It understands natural language, making it suitable for beginners, though this isn’t its primary strength. MidJourney tends to be more creative while enhancing poor prompts.
However, Flux 1.1 Pro is cheaper, faster, and generally better in quality than any current model, potentially making it the best option for those seeking good prompt adherence, quality, and text generation capabilities.
For those willing to pay for the model, any of the current options does the job. We liked the service provided by Fal.AI because it provides more control than the others. However, Freepik seems to be the best option for those who want a more pro experience. While slightly more expensive, it’s significantly more versatile, offering not only image generation services but also additional features like image upscaling, outpainting, draft-to-image generations, a background remover, and a library of content for experimentation.
Jose Antonio Lanz
https://decrypt.co/284932/meet-flux-1-1-pro-best-ai-image-generator
2024-10-07 19:00:55