AnimeSuki Forum - View Single Post

Renegade334 · 2024-02-07, 20:37

I'm posting this after deciding to wind down my use of Stable Diffusion. I cannot deny that this strange journey was quite fun in its own way; much of the gratification came from accidentally stumbling upon that happy visual coincidence that occurs after generating ten or twenty artistic aberrations, exercises in blandness or outright horrors. It's a bit like playing at the casino, really. I nevertheless can't hide from the truth: it's a time-consuming pastime and there always comes a time when one just arrives at the following conclusion, "eh, time for something else; hopefully something more constructive". It was about time, too: real life has been rather burdensome lately (my mother's health concerns, bad weather, life getting more expensive). In the past couple of days, I even started deleting more than half of my diffusion models, figuring that my PC would appreciate the extra free storage space.

Adelante, amigos!

I'm sure you'll recognize the characters and the series they belong to. Don't be surprised by the high number of female characters - the vast majority of LoRA models at Civitai, Pixai and Huggingface are of the female persuasion.

Funnily enough, I started my foray into visual AI because a friend of mine asked me to Photoshop together a rather complex picture, which he'd use as the cover for a fanfiction of his - and I struggled with the task (there were two pics/renders that refused to work together, so adamantly that I even considered using Blender to alter the source material and forcefully resolve the issue). So I turned to Stable Diffusion to see if it couldn't cut this Gordian knot for me...and I faceplanted even harder.

The results were even worse. But despite that resounding failure, I was impressed with some of my first experiments and I ended up falling into a strange rabbit hole. It's now time for me to pull myself out of it and walk away. For a while, I think.

...As an impromptu parting gift...remember those NSFW pictures I posted in the Code Geass sub-forum? Well, I've been juggling different diffusion models to see how their outputs differed from one another despite using almost identical values (seed numbers, sampling and hires steps, etc.) as input parameters. I simply ran the same CG pics' prompts through a more visually intricate, aesthetically stringent and watercolor-oriented model that has been yielding me some real gems lately. Consider this - for the time being - my AI swansong.
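For anyone wanting to repeat that kind of side-by-side comparison, here is a minimal Python sketch of the idea: freeze every input (prompt, seed, sampling and hires steps) and change only the model. The class, the function and the model names are all hypothetical placeholders and are not tied to any real Stable Diffusion API; the values are purely illustrative.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class GenerationSettings:
    """Hypothetical container for one generation run (not a real SD API)."""
    model: str
    prompt: str
    seed: int
    sampling_steps: int
    hires_steps: int


def same_settings_across_models(base, models):
    """Return one settings object per model, changing only the model name."""
    return [replace(base, model=name) for name in models]


# Illustrative values only; none of these come from the original post.
base = GenerationSettings(
    model="model-a",
    prompt="finely detailed face and eyes, dramatic lighting, medium shot",
    seed=12345,
    sampling_steps=30,
    hires_steps=15,
)

runs = same_settings_across_models(base, ["model-a", "watercolor-model"])
for run in runs:
    print(run)
```

Keeping the settings immutable makes it harder to nudge a seed or step count by accident between runs, so any difference in the output can be pinned on the model rather than on the inputs.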

(MOSTLY) NOT SAFE FOR WORK: (Now sorted by character to better keep track of what needs to be reupped)

NSFW


I have...other...images like this that are EVEN more unsafe, but I'm pretty sure the forum rules won't let me post them here, so I shall stay on the right side of things and abstain.

I don't think I'll be updating this post with new stuff the way I did for the one above (if you haven't already, do check it out - you might've missed some additions!), but I'm in the middle of cleaning out my folders and pruning out unneeded stuff, so who knows. And let us allow this thread to fall back into a well-deserved torpor, once again.

Spoiler for ever-ballooning changelog + FINAL UPDATE ON THE MATTER:

EDIT - Imgur, the ever-so-frustrating hoster everyone likes to love, has a permanent grudge against these two sets of pictures, so I'm uploading them on a different image host (Imgbox), same as for the Kusanagi Motoko picture above. I just don't want to keep reupping these all the time. They barely last two days.

NSFW

you know the drill: content is _not_ safe for work


EDIT two - I've reupped for nostalgia's sake most of my old signatures here - beware, though, some of them are actually above the forum's size limit:

Spoiler for antediluvian stuff - do people still use forum signatures, anyway?:

2024-02-07, 20:37 · Link #142
Renegade334 · Sleepy Lurker Graphic Designer · Join Date: Jul 2006 · Location: Nun'yabiznehz · Age: 38

Spoiler for ever-ballooning changelog + FINAL UPDATE ON THE MATTER:

EDIT #1: replaced some dead links. It's rather perplexing to see fresh uploads die this quickly...maybe it has something to do with the hatred for all things AI-generated on Imgur, who knows.

EDIT #2: did some Spring cleaning on this post while I was cleaning up my HDD and uploading the remaining few goodies worth posting; I have now reordered all pics based on subject character, as it helps me resurrect pictures that are surprisingly quick to get deleted from Imgur...for some reason (despite being marked "Mature", hidden and not posted on the community, which is annoying). I'll be reupping those dead links whenever I drop by again; Imgur seems to have it in for Motoko Kusanagi and Milly Ashford especially - just to cite two that I've had to restore twice already.

EDIT #3: had further discussions with a fellow Stable Diffuser on the subject of multi-LoRA compositions and he advised me to switch to "Latent" mode in Regional Prompter and force SD's precision level to fp32 across the board. The results are fantastic (much better separation of character traits and higher detail even at low-res), but the cost to generation speed is, on the whole...just loathsome. Then again, it's all water under the bridge at this point. *shrugs* Anyway, I added two new pics at the bottom of the TLDR tags right above. Cheers.

EDIT #4: *Oh. Em. Eff. Gee.* I just discovered an extension called Multidiffusion Upscaler for Automatic1111, which looks like an instant, no-look-back replacement for Regional Prompter. Why in God's good name did I only learn of it now? I just have to test it with hires fix to make sure I'm not imagining things.

EDIT #5: well, I've got a new personal record for upscaling ratio: 2.5, up from 2, which allowed me to create a 1280x1920px image, bigger than the 1024x1536px I thought would be the hard limit for me. So...in the end, the MdU extension might be a smidge better than RP thanks to its improved VRAM management and memory savings, but it's still a lot more time-consuming than I'm comfortable with. Also, when it comes to multi-character compositions, it doesn't feel like it integrates the individual components into the greater whole with as much synergy and interaction as RP does; they feel a bit independent of one another, as if they'd been inpainted separately, which...sometimes results in perspective issues (e.g. one individual being drawn a LOT bigger than the other, despite having same-sized regions specified in the prompt). It does, however, completely negate the risk of blending different characters into a single one, as RP is, most unfortunately, quite prone to do. I'm still seeing some strange outputs, though. I also added a 1920x1280px pic (a landscape variant of the 1280x1920 one) - the faces were then extracted to separate image files, run through a no-prompt img2img (I made the mistake of NOT enabling After Detailer during initial generation) and Photoshopped back in.

EDIT #6: this will be my final contribution to this post aside from routine maintenance (curse you, Imgur deletions!). I tried the Multidiffusion extension and I found it a most interesting piece of optional software, in some ways better than Regional Prompter but in some ways more...lacking and irritating. The memory management system is impressive (my SD installation on my potato PC became less crash-prone) but Tiled VAE and Tiled Diffusion do make everything slow down to a crawl, even after removing --medvram from SD's launch parameters. The upside? For the first time since I started diffusing, I've been able to generate 1920x1280px and 1280x1920px images - something that I thought was too much to ask from my aging PC until now. MD furthermore allows more than two LoRAs per picture! Without using ControlNet and img2img inpainting? Foul witchcraft, I say! But it does. However, one ensuing problem that I've been grappling with for the last two days is that if the bounding boxes are too close to one another, the separate outputs tend to fade into each other instead of fusing like RP does. There are also perspective issues because MD doesn't allow you to specify the desired depth for each region (ControlNet can solve this, but it's more resource-taxing, naturally).

On a different topic, I'm sure some will ask, "your pics...why is it always Milly and Kallen together? Why not C.C. or Kaguya? It's kinda repetitive..." Well, here's the thing with Stable Diffusion: you start by testing small, basic prompts and figure out what works and what doesn't (SD doesn't use natural speech, by the way - it's all about how you parse the description); you take the prompt that seems to produce the best or most accurate results and underline the keywords (or strings thereof) that seem to make a difference (like "finely detailed face and eyes", "dramatic lighting", "from above", "medium shot" or "cowboy shot", etc.). Then you build upon it; you tweak a few things here and there and try to move out of your comfort zone. You test different viewing angles, poses, clothes, camera/atmospheric styles. It's an iterative exercise of trial and error, pure and simple...and I did a lot of it while trying to create a triple-LoRA image, which never seemed to quite work for me. Regional Prompter just spat out two characters, one...or fused them altogether into an eldritch, multilimbed and highly elastic creature. Two separate characters seemed to be the limit for THAT extension. As for my choice of character models...well, the Kallen and Milly LoRAs were of very good quality and I had some success with C.C., although I never quite managed to get her bored expression just right. Villetta and Cornelia were also very nice (though Villetta often showed disconnected ponytails and SD refused to get her skin tone right), but Lelouch...! Urgh, what a tricky bastard he was - either I got good outputs or very bad ones. Nothing in between.

EDIT #7: Multidiffusion Upscaler (region prompt + tiled VAE options both activated) + ControlNet (either Depth, Normal or Segmented) + Hires fix (4x-UltraSharp) is an absolutely brilliant combination (ControlNet's finally stable, I finally got Insightface to install from within SD's venv session and xformers to work properly). OpenPose is still a bit iffy (I sometimes got characters in the right pose, but facing the other way), but I've had fantastic results with Depth and Normal. Just look at three pictures (#1: C.C. on her bed; #2: Milly in the snow; #3: Lelouch + Kallen + Milly) to see the sheer possibilities this combo unlocks. It neatly solves the perspective issues I previously had with MultiDiffusion and Regional Prompter. The only issue with Depth and Normal is that you need a reference picture (for the character pose), and unless you have one on hand, you'll be forced to use additional software like VRoid Studio or PoseMy.art to create that ControlNet reference picture, therefore increasing your workload. Still, it was well worth the trouble for me. Also, on the topic of generation speed, it is highly advised to go into Settings > Hypertile and turn on all the checkboxes to shave off a massive amount of time (it could be up to four times faster!), but I have yet to see if it causes massive visual differences in the output.

And that's it. Voilà. I'm done. Folder's all cleared out.

Last but not least: for the record, my latest multi-LoRA works have also seen me going back to Photoshop for some heavy post-processing/fine-tuning (I found it easier to run cropped pics of the faces, hands and bodies through After Detailer img2img and then copy-paste everything back together in Photoshop, rather than regenerating). A bit ironic, but not surprising. Anyway, I've therefore uploaded a few more images as proof of my latest breakthrough, and I think it very fitting that these would be my last uploads (the final one being Lelouch sitting in a fauteuil, flanked by Kallen and Milly) for this post - multi-LoRA composition has long been a road fraught with failure and setback for me, and I'm happy, for the time being, to leave the visual AI field with my head held high and able to brag about finally overcoming a hurdle that long stood in my way. But enough of my technical rambling - enjoy!
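The resolution jump mentioned in EDIT #5 is easy to sanity-check. The sketch below assumes a 512x768 base image, which the post never states outright; it is inferred from 1024x1536 at a ratio of 2 (and agrees with 1280x1920 at 2.5). The function name is hypothetical, and raw pixel count is only a rough stand-in for generation cost.

```python
def upscaled_size(width, height, ratio):
    """Scale both sides by the upscaling ratio and round to whole pixels."""
    return round(width * ratio), round(height * ratio)


# Assumption: base resolution inferred from the figures in the post.
BASE_WIDTH, BASE_HEIGHT = 512, 768

old = upscaled_size(BASE_WIDTH, BASE_HEIGHT, 2)    # previous limit
new = upscaled_size(BASE_WIDTH, BASE_HEIGHT, 2.5)  # new record

# Pixel count grows with the square of the ratio: (2.5 / 2) ** 2
pixel_growth = (new[0] * new[1]) / (old[0] * old[1])

print(old)           # (1024, 1536)
print(new)           # (1280, 1920)
print(pixel_growth)  # 1.5625
```

So going from 2 to 2.5 means roughly 56% more pixels to push through, which may be part of why the larger images feel so much slower on modest hardware.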
__________________
<< -- Click to enter my (dead) GFX thread.
Last edited by Renegade334; 2024-03-07 at 12:50. Reason: Replacing dead links + reorganizing them to better identify dead counterparts + FINAL UPLOADS (2024.02.25)[collection:complete]