The world of AI image generation is moving extremely fast. With Stable Diffusion, an intriguing new model has now launched to compete with DALL-E and Midjourney while retaining an identity all of its own. Its developer, Stability AI, opened the doors to developers and researchers a few weeks ago; now comes the step into the public, putting Stable Diffusion through its first serious stress test. Midjourney has already taken advantage of this, tapping the power of Stable Diffusion in a new beta.
More than 10,000 users are said to have had access to the new tool already. In contrast to DALL-E or Midjourney, it is not locked behind a paywall: with a bit of skill it can be installed on the standard GPU of a reasonably equipped gaming PC, completely free of charge, since the model is fully open-sourced. Alternatively, DreamStudio is available as a web interface, which you can also test for free with a few complimentary credits. A further 1,000 credits cost 10 pounds.
“Stable Diffusion is a text-to-image model that will empower billions of people to create stunning art within seconds. It is a breakthrough in speed and quality meaning that it can run on consumer GPUs”, explains project leader Emad Mostaque.
“As these models were trained on image-text pairs from a broad internet scrape, the model may reproduce some societal biases and produce unsafe content, so open mitigation strategies as well as an open discussion about those biases can bring everyone to this conversation.”
“It understands the relationships between words to create high quality images in seconds of anything you can imagine. Just type in a word prompt and hit “Dream”.”
“The model is best at one or two concepts as the first of its kind, particularly artistic or product photos, but feel free to experiment with your complementary credits. These correlate to compute directly and more may be purchased. You will be able to use the API features to drive your own applications in due course.”
“The core dataset was trained on LAION-Aesthetics, a soon to be released subset of LAION 5B. LAION-Aesthetics was created with a new CLIP-based model that filtered LAION-5B based on how “beautiful” an image was, building on ratings from the alpha testers of Stable Diffusion. LAION-Aesthetics will be released with other subsets in the coming days on https://laion.ai.”
“Stable Diffusion runs on under 10 GB of VRAM on consumer GPUs, generating images at 512×512 pixels in a few seconds. This will allow both researchers and soon the public to run this under a range of conditions, democratizing image generation. We look forward to the open ecosystem that will emerge around this and further models to truly explore the boundaries of latent space.”
“The safety filter is activated by default as this model is still very raw and broadly trained. This may result in some outputs being blurred, please adjust the seed or rerun the prompt to receive normal output. This will be adjusted as we get more data. Please use this in an ethical, moral and legal manner. Make your mothers proud.”
How to use DreamStudio
To try DreamStudio for free in your browser, simply create an account at beta.dreamstudio.ai or log in with Google or Discord. You land on the start page, where you can enter your desired prompt right away. Unlike Midjourney, you don’t set an aspect ratio; instead, you choose the width and height in pixels individually. Images of up to 1024×1024 pixels are currently possible.
Other controls let you set the weighting of the prompt, the number of steps to the finished image (default: 50) and the number of generated images, from 1 to 9. Please note: “Adjusting the number of steps or resolution will use significantly more compute, so we recommend this only for experts.” Finally, DreamStudio is particularly good at reproducing a similar result across runs if you fix the seed.
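For readers driving these settings from their own scripts rather than the web interface, the knobs described above can be collected into a single request object. The sketch below is a hypothetical illustration: the function and field names are my own, and only the limits mentioned in this article (images up to 1024×1024 pixels, 1 to 9 images per run, 50 steps by default, an optional seed for reproducible output) are taken from the source.

```python
# Hypothetical sketch of assembling DreamStudio-style generation settings.
# Field names are illustrative; the limits come from the article above.

MAX_SIDE = 1024  # maximum width/height in pixels


def build_request(prompt, width=512, height=512, steps=50, samples=1, seed=None):
    """Validate the exposed knobs and pack them into one request dict."""
    if not (0 < width <= MAX_SIDE and 0 < height <= MAX_SIDE):
        raise ValueError("width and height must be between 1 and 1024 pixels")
    if not 1 <= samples <= 9:
        raise ValueError("between 1 and 9 images can be generated per run")
    request = {
        "prompt": prompt,
        "width": width,
        "height": height,
        "steps": steps,     # more steps = more compute
        "samples": samples,
    }
    if seed is not None:
        request["seed"] = seed  # fixing the seed yields similar output on reruns
    return request
```

Calling `build_request("a lighthouse at dawn", seed=1234)`, for example, would produce a payload that reuses the same seed on every run, which is exactly how you coax DreamStudio into repeating a result you liked.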
“We hope you enjoy having a trillion images in your pocket, please do share and tag your output with #StableDiffusion!”