/bbwai/ - bbwai

Stable Diffusion, Loras, Chatbots


(4.04 MB 459x336 004a.gif)
(5.27 MB 424x240 008a.gif)
Hunyuan Video 01/12/2025 (Sun) 20:09:38 No. 30947
If you haven't tried out Hunyuan Video yet, you're missing out. Out of the box it can produce some interesting results; with loras, what it can produce is pretty damn incredible. I've trained a few loras using diffusion-pipe on runpod, and I'm currently on version 5.00 of my project. I've been training on a combo of 512x512 videos (33 frames) and images. I'll post some examples - note that they've been converted from .mp4 to .gif and downscaled for upload here, so there is quality loss. I'd love to get some prompt ideas for testing. Comment if you have something you'd like to see and I'll post the results. Hunyuan uses natural language prompts.
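For anyone curious about the dataset side: before a training run I like to sanity-check that every clip really is 512x512 and 33 frames, since diffusion-pipe buckets by resolution and frame count. A throwaway sketch with OpenCV - the folder path is just my layout, not anything the trainer requires:

import cv2
from pathlib import Path

for vid in Path("dataset/videos").glob("*.mp4"):  # hypothetical path
    cap = cv2.VideoCapture(str(vid))
    w = int(cap.get(cv2.CAP_PROP_FRAME_WIDTH))
    h = int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT))
    n = int(cap.get(cv2.CAP_PROP_FRAME_COUNT))
    cap.release()
    if (w, h, n) != (512, 512, 33):
        print(f"{vid.name}: {w}x{h}, {n} frames")  # flag off-spec clips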
>>31219 Well, it looked like it was going in the right direction, so it's always worth a try.
>>31220 Those first two successful examples I posted were the only two I got out of 100+ generations, and even then they were far from perfect. A good lora = consistency, reliability, and quality. At the moment idk what I could change to drastically increase the odds of success and improvement, and it just isn't worth the monetary gamble while crossing my fingers. Not only is it ~$20 per train, you then have to rent an A40 to test the lora with around 100 generations to see if it's successful and to find the best epoch. It's a decent amount of time and money. I'm not giving up on inflation - I'm still thinking about it, it's just on the back-burner. Hopefully someone else cracks the code in the meantime.
Wow, I'm definitely going to have to try this. Thanks for all of the walkthroughs and gens so far. I know it's costly and time-intensive to test, but has anyone experimented with more exaggerated sizes? I guess the expansion/inflation gens basically do that. I could post some gens to play with as well.
Anyway, I would love to see more belly play, like a woman playing with someone else's belly, or feeding play paired with poking or pinching.
>>31223 I will train a hunyuan lora for exaggerated size without the inflation process. I think it'd be pretty successful. Not sure how the physics would turn out as it would be entirely image based. >>31225 Poking is something I really want to better implement in v3.00. If you could recommend any videos depicting this it'd be greatly appreciated - something where the woman really sinks her fingers in deep would be mint. -drewski
>>31228 Does it need to be actual videos of the whole woman, or just the belly? Can AI content be used too?
>>31229 Close-ups of the belly work great, and yes, videos. It could be the full woman as well, but that's not necessary. AI content is also fine. Thanks! -drewski
>>31229 Would some of these samples be good enough for the training dataset? https://gofile.io/d/tPjMuP
>>31237 Thanks! I'll try to use a few of these. The ones that I really like have some edited-in artifact effects, so I'm not 100% sure. I'm going to really increase the size of the dataset for v3.0, so it potentially shouldn't affect the final product.
(2.66 MB 736x416 vid_00152.webp)
(1.23 MB 736x416 vid_00070.webp)
(2.61 MB 736x416 vid_00069.webp)
(2.04 MB 736x416 vid_00055.webp)
Big thanks to all the info in this thread!
(938.66 KB 736x416 vid_00046.webp)
(2.73 MB 736x416 vid_00044.webp)
(1.89 MB 736x416 vid_00037.webp)
(1.95 MB 736x416 vid_00025.webp)
(1.04 MB 736x416 vid_00011.webp)
(3.63 MB 736x416 vid_00006.webp)
(1.66 MB 736x416 vid_00001.webp)
(1.64 MB 736x416 vid_00003.webp)
>>31242 Ok, great. Added a few more. Just tell us how many samples, and with what specifics, you'd like to see. I guess many folks have their stashes pretty much under control, lol
>>31246 Do you have more belly play and belly button pokes?
>>31245 Some great gens here, but whatever you've done in conversion, they've lost their metadata. Would be great if you could post in webm, maybe webp doesn't support the "comment" metatag?
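If you do post a png, pulling the workflow back out is trivial - ComfyUI writes the node graph and the prompt as json strings into the png's text chunks, which Pillow exposes via img.info. A quick sketch (the filename is hypothetical):

from PIL import Image

img = Image.open("vid_00152.png")  # any ComfyUI-generated png
# ComfyUI stores these as json strings in the png text chunks
print(img.info.get("workflow"))  # the full node graph
print(img.info.get("prompt"))    # the inputs actually used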
(916.70 KB 512x512 McDWaddle002.webp)
(961.19 KB 512x512 McDWaddle003.webp)
(1.21 MB 512x512 McDWaddle001.webp)
(1.36 MB 512x512 McDWaddle004.webp)
>>31049 Thank you kindly, m8! I'm lovin' it!
Hey, I read that Hunyuan video-to-video can replace actors perfectly. Maybe we could swap the actors in some scenes, like Aunt Marge or Violet, or turn male scenes female, like the fat ghost from RIPD. Maybe even swap inflatable latex suit masks onto heads to make it look like people inflating?
Also, any luck making inflation vids?
(295.75 KB 736x416 vid_00159.webm)
(434.29 KB 736x416 vid_00152.png)
(412.53 KB 736x416 vid_00113.webm)
(1.31 MB 736x416 vid_00053.webm)
(2.11 MB 736x416 vid_00004.webm)
(772.62 KB 736x416 vid_00051.webm)
>>31249 Oops, thanks for pointing that out. Started playing with the FastVideo model and wildcard prompting for these - it runs in half the time of the gguf model, but I'm not quite getting the level of detail I want. Also including a png of a previous gen with the meta info.
(412.08 KB 736x416 vid_00006.webm)
(919.84 KB 736x416 vid_00062.webm)
(695.79 KB 736x416 vid_00082.webm)
(899.89 KB 736x416 vid_00085.webm)
(438.30 KB 736x416 vid_00090.webm)
(1.06 MB 736x416 vid_00104.webm)
(675.56 KB 736x416 vid_00160.webm)
(783.02 KB 736x416 vid_00161.webm)
(487.89 KB 1661x938 Screenshot 2025-01-20 134459.png)
>>31272 Thanks for sharing the png with your workflow info! I'm working through getting ComfyUI and Hunyuan Video set up. I imported all the missing nodes from your workflow via the ComfyUI Manager, but it keeps saying I'm missing the Get_Model, Set_Model, Get_VAE, and Set_VAE nodes, and I can't seem to locate them. Apologies if this is an obvious question - I'll make up for it with some gens once this is up and running.
>>31275 I had the same issue with that workflow. Honestly I just deleted them and connected the model/vae nodes directly. They aren't necessary, just there to make things look cleaner.
>>31277 Thanks! Weirdly it seemed to resolve itself after I imported some other workflows, restarted ComfyUI a few times, and relaunched it.
>>31273 These look amazing. Are they all being generated via RunPod? I have a 3090 with 24 GB of VRAM, and am getting meh results with the .gguf model out of the box. Probably need to tweak settings.
>>31280 These were generated using hunyuan_video_FastVideo_720_fp8_e4m3fn.safetensors. I've had trouble getting consistent results like these so far, but I'm still switching between the two models as I learn what all the levers and buttons do. You definitely need to tweak settings - I have a 4070 Ti Super and 32GB of RAM for these generations. You should follow the info another anon posted in here, which helped me a lot: >>31039
(249.97 KB 736x416 vid_00169.webm)
(350.32 KB 736x416 vid_00174.webm)
(536.24 KB 736x416 vid_00180.webm)
(610.28 KB 736x416 vid_00211.webm)
>>31055 I've set my UI up exactly like this, but I'm not sure where to find the "llava_llama_fp8_scaled" clip. When I generate, all the frames are just black. Not sure if the two things are related. Any advice?
>>31289 llava_llama_fp8_scaled can be found here: https://huggingface.co/Comfy-Org/HunyuanVideo_repackaged/tree/main/split_files/text_encoders
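If you'd rather script the download than click through, something like this should drop it straight into your ComfyUI folder. A sketch with huggingface_hub - double-check the exact filename on the repo page, and adjust local_dir to your install:

from huggingface_hub import hf_hub_download

path = hf_hub_download(
    repo_id="Comfy-Org/HunyuanVideo_repackaged",
    filename="split_files/text_encoders/llava_llama3_fp8_scaled.safetensors",
    local_dir="ComfyUI/models/text_encoders",  # adjust to your install
)
print(path)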
(1.26 MB 2048x1200 vid_00002.webm)
(1.36 MB 768x448 vid_00011.webm)
(1.68 MB 768x768 vid_00019.webm)
(539.35 KB 800x480 vid_00006.webm)
(1.04 MB 448x768 vid_00018.webm)
(642.10 KB 448x768 vid_00023.webm)
>>31150 Utilizing the "Lora Double Block" node I got from YAW. Long rant incoming.

A few more thoughts after a couple of days using runpod for the first time. It's amazing this tech exists for the average joe to throw $25 at the internet and have top-of-the-line hardware at their fingertips. It's less amazing that cloud tech isn't exactly... perfect. From slow loading of ComfyUI elements, to random disconnect/reconnect messages, to SSH and root access being cut off or reset at odd times, to a pod having all its GPUs rented out while your data sits there, so you just... have to wait for who knows how long for one to free up. It's not 100% smooth.

Beyond that, while it is indeed cheaper by a significant margin electricity-wise, it surprised me that an A40 is not all that much *faster* at generating, at least at the resolutions I was doing on my 3080. Is it faster? Sure, but I expected... like... way faster. It's maybe 20-30% faster at the resolutions I've been working with in the YAW workflow. Nothing mind-blowing. At least it can do additional upscaling and interpolation if I want, but tbh that adds so much generation time I generally don't bother.

I'll admit I've been kind of stuck in a rut using ga1n3rb0t + the slime girl lora. The slime generally tends to cover up imperfections in hands and such. I really should branch out, but I've just been kind of... fascinated. Like watching a lava lamp with tits. I was disappointed that the breast growth lora didn't really work at all (though I did get one golden seed that did reverse breast growth).

For anyone wondering if the Fast/Quantized/Distilled/GGUF versions of Hunyuan are worth playing with on their home computers... no. Not really. All the smaller models are vastly inferior to the full model.

I tried out Nvidia Cosmos. It's... interesting, especially since you can run the 14B model on the A40. But even with a starting and an ending image, the results were not great, at least for extreme anime fats. Maybe it would do ok with more realistic fats, but I didn't bother cause who wants to see that. It's nothing compared to what Kling etc. can do with image2video.

As for the YAW workflow, I think it's the best one ATM. It's easy to understand and well-documented (for the most part). I do change a couple of things:

I usually change the base (fast) resolution from 368x208 to 384x224 in both landscape and portrait (inputs 1 and 3 on step 4 in YAW). Almost no increase in generation time, and better for intermediate i2v.

I also generally do 117 frames instead of 141. The difference between a 5 and a 6 second video is minimal, and the extra second of clip time adds over a minute of total initial + i2v generation time. Not worth it in my opinion. I don't know how some of these guys are starting at a much higher resolution and upscaling from there; I don't have the patience to wait 15+ minutes for ONE 6 second clip.

I change the TeaCache scheduler back to original (1x) on the initial generation. Using the cache here only speeds things up by 10-15 seconds, it's already pretty fast, and it generally produces a worse image for i2v. I do leave TeaCache at 1.6x on the i2v though; the speed increase there really matters.

With these settings I can generate the initial "preview" clip in 20-25s, and if I choose to progress the clip, I can get a mid-res i2v upscale at 448x768 in around ~4.5m, for a total time-per-clip of less than 5 minutes at 448x768.

If I choose to go further with the flat 2x upscale to 896x1536, it's an additional 3 minutes (the result is more or less the same as just zooming into the video manually). So 8 minutes total for a 5 second clip at 896x1536. Interpolation can only be done at 73 frames; trying it at 117 or above OOMs the A40. Using 141 frames makes the i2v take about 6.5 minutes, and the additional upscale then takes about 3 minutes as well, so about 10 minutes for a 6 second clip at 896x1536. I tend to be zooming through generations, so I rarely want to wait upwards of 25% longer per generation for just 1 additional second per clip.

It's definitely disappointing that we have no real way yet in Hunyuan to extend a video from the last frame. Hope that comes soon. Thanks for coming to my ted talk.
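P.S. if you want to check my math on the frames tradeoff, here's the back-of-napkin version in Python (assuming Hunyuan's 24 fps output, and using my rough A40 timings from above):

FPS = 24  # Hunyuan's default output rate

# (frames, i2v minutes, 2x upscale minutes) from my runs above
for frames, i2v_min, up_min in [(117, 4.5, 3.0), (141, 6.5, 3.0)]:
    seconds = frames / FPS
    total = i2v_min + up_min
    print(f"{frames} frames = {seconds:.1f}s clip, "
          f"~{total:.0f} min gen, ~{total / seconds:.1f} min per second")

117 frames works out to roughly 1.5 minutes of waiting per second of video, 141 to about 1.6 - so the longer clip really is the worse deal per second.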
(531.10 KB 448x768 vid_00035-1.webm)
(1.21 MB 640x832 vid_00004.webm)
(841.33 KB 736x416 vid_00029.webm)
(1.12 MB 608x608 vid_00031.webm)
(2.00 MB 2048x1200 vid_00005.webm)
A few more because I'm a freak. Also sobering to realize that 12+ hours of gooning got me, in the end, about twelve 3-5 second clips that were worth posting. Holy shit, what a time vampire.
(1010.54 KB 736x416 vid_00023.webm)
(592.06 KB 736x416 vid_00029.webm)
(626.66 KB 736x416 vid_00030.webm)
(1.06 MB 736x416 vid_00031.webm)
(711.11 KB 736x416 vid_00035.webm)
(654.74 KB 736x416 vid_00054.webm)
(423.42 KB 736x416 vid_00314.webm)
(1.46 MB 736x416 vid_00202.webm)
(440.41 KB 736x416 vid_00217.webm)
(552.72 KB 736x416 vid_00294.webm)
(713.73 KB 736x416 vid_00307.webm)
(921.34 KB 736x416 vid_00260.webm)
>>31140 Would love to see a mega of all the jinx ones you did.
>>31227 Anyway, are you going to do trainings of women with huge, exaggerated hanging bellies and chubbier love handles? Because I have some videos you could try training on.
(322.68 KB 768x512 HY_00794.webm)
(381.58 KB 768x512 HY_00796.webm)
(285.29 KB 768x512 HY_00798.webm)
(618.61 KB 768x512 HY_00823.webm)
(312.15 KB 768x512 HY_00838.webm)
(525.16 KB 768x512 HY_00862.webm)
>>31312 That is still my plan, but I couldn't give you a timeline unfortunately - I'm a little tight on cash at the moment. Latest development is that I attempted to train v3.00 of ga1n3rb0t, but I'm not stoked on the results. I doubled the video samples in the dataset hoping for more diverse outputs and better prompt adherence, but it didn't turn out quite right. Some improvements did make it through: bellies appear more pliable and behave more realistically, and clothing is also easier to prompt for. But there appears to be a degraded understanding of anatomy, movement has become unpredictable, and although clothing generates more easily, it stays static a lot of the time - not the effect I was hoping for. It also generates extremely large bellies automatically (without prompting for them), which I'm sure a lot of people here would consider a plus - but this makes it a lot less flexible than the current version; better to have a range with prompting imo. I'm undecided whether I'll release it, as it doesn't feel like a major improvement over the current version. I think I'll go back to the drawing board for now and revise the dataset before trying again. I want to re-clip every video I've collected to make it longer (to 65 or 97 frames). Anyways, here are some samples from ga1n3rb0t v3.00 - curious to know what the consensus is.
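Side note on the re-clipping: doing that by hand for every video would be miserable, so the plan is a throwaway script. Something like this sketch, assuming ffmpeg is on the path - the folders and 24 fps are my own layout, not anything required:

import subprocess
from pathlib import Path

FRAMES = 97   # target length (65 or 97, per above)
FPS = 24      # assumed frame rate for the dataset

src = Path("dataset/raw")      # hypothetical input folder
dst = Path("dataset/clipped")  # hypothetical output folder
dst.mkdir(parents=True, exist_ok=True)

for vid in src.glob("*.mp4"):
    # keep only the first FRAMES frames at a fixed fps
    subprocess.run([
        "ffmpeg", "-y", "-i", str(vid),
        "-vf", f"fps={FPS}", "-frames:v", str(FRAMES),
        str(dst / vid.name),
    ], check=True)

-drewski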
Tried running this on my own GPU instead of using any cloud nonsense, and it froze at about 10%, with Task Manager saying I was at 31/32 GB of memory. I assume that means my PC isn't good enough?
(334.19 KB 736x416 vid_00021.webm)
(335.39 KB 736x416 vid_00017.webm)
(337.48 KB 320x608 vid_00012.webm)
(207.84 KB 736x416 vid_00008.webm)
(335.14 KB 736x416 vid_00005.webm)
(441.40 KB 736x416 vid_00003.webm)
>>31322 It should be released! You paid for it, and honestly, a default BEEG lora would be great to have. >>31311 There really weren't all that many more, but here's a few from those first couple of nights' sessions on my own machine. Since renting that runpod I had 3 days where I could access a GPU, and it's been filled up ever since... haven't wanted to redownload all the models/loras on another runpod or pay for network storage.
>>31324 Or it could mean you need to turn the tile_size and overlap parameters down, as well as the initial starting resolution before i2v. Even on an A40 runpod you have to be pretty careful about what settings you use with the YAW workflow, or you can still OOM.
>>31322 I'm actually fairly interested in making my own Hunyuan video lora... I have over 3TB of videos sitting around from the old array, and I feel like I could make a great generalized fat lora, but I don't even know where to begin on making/training a dataset.
>>31328 Hell yeah dude, try it out. Posted this link previously - it's still the best resource I've found: https://www.youtube.com/watch?v=wVTZj-RGIXw He covers the entire process, including dataset prep. Once you've made one, it's actually a super simple process to get the hang of; dataset prep just takes forever. I've been captioning by hand so far. There are GPT programs for video captioning, but I haven't tried any yet.
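P.S. on the captioning grind: the layout diffusion-pipe wants is just a .txt with the same basename next to each clip. I stub them all out first, then fill them in by hand - a little sketch, dataset folder hypothetical:

from pathlib import Path

dataset = Path("dataset/clipped")  # hypothetical dataset folder

for vid in dataset.glob("*.mp4"):
    cap = vid.with_suffix(".txt")
    if not cap.exists():
        cap.write_text("")  # empty stub to fill with a natural-language caption
        print(f"created {cap.name}")

-drewski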
>>31326 Hey, btw, can you please make a strongfat vid of Vi, Sevika, or Ambessa?
>>31339 I mean, cuz it looks like it needs a strong PC ;-;
Or does someone know how to make vids with Hunyuan for free, without needing to run it locally?
New version of ga1n3rb0t posted to my civitai page. https://civitai.com/models/1144518 ga1n3rb0t v2.1 (Experimental). "Experimental" because it isn't as stable as the previous version when it comes to outputs. Make sure to read the version notes on the page for more info. -drewski
Hey, has Hunyuan been able to do air inflation yet? I mean, cuz I always wondered if it would be possible to make a video of Bowsette being inflated like a balloon in live action using AI, as a reference to the Mario Party 2/Super Mario Party minigame Bowser Balloon Burst.
>>31381 Read the posts above: there was an inflation lora in the works, but it only barely manages inflation and takes several tries. You could ask this guy to keep making the inflation one, but he said he would focus on the weight gain one, iirc.
Anyone willing to do a request? I want to use this AI, but I don't have my PC rn.
(819.51 KB 416x736 vid_00048.webm)
Anyone able to get the more specific loras to generate heavier women?
Does anyone know how I would download a civitai lora when using runpod? I've tried to upload a few times, but the file is too big and my wifi is too slow for it to fully upload. Thanks in advance!
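If you're already on the pod, you can skip the upload entirely and pull it straight from civitai: every model version has a download endpoint, and you can pass an API key from your account settings for files that need auth. A rough sketch - the version id, token, and lora path are all placeholders:

import urllib.request

VERSION_ID = 123456  # placeholder: from the lora's civitai page
TOKEN = "your-civitai-api-key"  # placeholder: account settings

url = f"https://civitai.com/api/download/models/{VERSION_ID}?token={TOKEN}"
urllib.request.urlretrieve(
    url, "/workspace/ComfyUI/models/loras/my_lora.safetensors"
)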
