A stylized anime model. This model is the result of my attempts to get the textures of MouseyMix, a model I'm very fond of, onto characters with more realistic body proportions. The 'Gyoza' in 'Store Bought Gyoza' is a shortening of "Gyokai" + "Zankuro", two of the artists MouseyMix is trained on. The 'Store Bought' comes from the fact that the 'bones' of this model, i.e. its primary compositional element, is SD-Silicon, or more specifically Silicon-29. SD-Silicon is made using autoMergeBlockWeight, an automated block merger, hence the dumpling is "made by machines". In other words: made in a factory, bought from a store.
The strength of this model is that you get a very soft, illustrative style with bright texturing, combined with very powerful scenery and landscapes. V3 is the least overtuned of all the GyozaMix models; it recognizes fantasy, sci-fi, and contemporary settings. It does tend toward very grandiose landscapes, though, so don't use it if you want something understated.
Everything is on this HuggingFace repository, along with the recipes. I go over all of my decision-making steps there. If you have anything to ask, don't hesitate to leave a comment. I believe in an open-source approach to Stable Diffusion model mixing, so I'll answer any questions you have about the process. It uses Block Weight Merging pretty extensively, so I would recommend reading up on UNet blocks as a starting point.
But if you want a general rule of thumb, what we're doing is splitting the model mixing process into two parts: a picture composition/landscape arrangement component and a textural component. The former we make by reverse-cosine Block Weight Merging any highly detailed anime model against Silicon-29; I like to use dpep4 variants. Then, after tuning the textural component to get the exact sort of texture we want, we cosine Block Weight Merge the textural component against our compositional component. After that, fine-tune to taste and we're ready.
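To make the cosine/reverse-cosine idea concrete, here is a minimal sketch of a per-block merge over the 25 SD1.x UNet block slots. This is an illustration of the technique, not the exact code the merge tools use: the block labels and the cosine schedule are assumptions modeled on how block-weight-merge UIs present things, and the "tensors" are plain floats for brevity.

```python
import math

# The 25 SD1.x UNet block slots as block-weight-merge UIs usually label them
# (12 input blocks, 1 middle block, 12 output blocks) -- an assumed naming.
BLOCKS = [f"IN{i:02d}" for i in range(12)] + ["M00"] + [f"OUT{i:02d}" for i in range(12)]

def cosine_weights(n=25, reverse=False):
    """Per-block alphas ramping smoothly 0 -> 1 (cosine) or 1 -> 0 (reverse)."""
    w = [0.5 * (1 - math.cos(math.pi * i / (n - 1))) for i in range(n)]
    return w[::-1] if reverse else w

def block_merge(model_a, model_b, alphas):
    """Interpolate each block: (1 - alpha) * A + alpha * B.

    model_a / model_b map block label -> weights (floats stand in for tensors).
    """
    return {blk: (1 - a) * model_a[blk] + a * model_b[blk]
            for blk, a in zip(BLOCKS, alphas)}

# Reverse-cosine keeps more of model B in the early (input) blocks and more of
# model A in the late (output) blocks; cosine does the opposite. Composition and
# texture live in different depths of the UNet, which is why the two passes
# (reverse-cosine for composition, cosine for texture) can be layered.
composition = block_merge(
    {blk: 0.0 for blk in BLOCKS},   # stand-in for the detailed anime model
    {blk: 1.0 for blk in BLOCKS},   # stand-in for Silicon-29
    cosine_weights(reverse=True),
)
```

The point of the sketch is just the shape of the operation: one scalar per block, scheduled along the UNet depth, rather than a single global merge ratio.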
Generally speaking, you will want to generate at a smaller size and then use some sort of Generative Adversarial Network based upscaler with a denoising strength of around 0.3 to 0.4. This model can handle Latent upscalers, but the result is less predictable. If you are using any DPM++ sampler with a Latent upscaler, you need to upscale at a denoising strength of 0.5 to 0.6, otherwise your image won't diffuse enough during the upscaling process.
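As a sketch of those settings, here is a small helper that builds a txt2img request body in the shape of the AUTOMATIC1111 webui's hires-fix options. The field names assume its `/sdapi/v1/txt2img` schema and the upscaler names assume that webui's bundled upscalers, so adjust for your frontend; the denoise values are the ones recommended above.

```python
def upscale_payload(prompt, sampler="DPM++ 2M Karras", latent=False):
    """Build a txt2img payload with a second-pass upscale.

    GAN-based upscalers tolerate a low denoising strength (~0.3-0.4);
    Latent upscalers with DPM++ samplers need ~0.5-0.6 to diffuse enough.
    Field names assume the AUTOMATIC1111 webui API -- treat as a sketch.
    """
    denoise = 0.55 if latent else 0.35
    return {
        "prompt": prompt,
        "negative_prompt": "(worst quality, low quality:1.4)",
        "sampler_name": sampler,
        "width": 512, "height": 768,   # generate small first
        "enable_hr": True,             # enable the second-pass upscale
        "hr_scale": 2,
        "hr_upscaler": "Latent" if latent else "R-ESRGAN 4x+ Anime6B",
        "denoising_strength": denoise,
    }
```

You would POST this dict as JSON to the webui's txt2img endpoint; the only part that matters for this model is the pairing of upscaler type with denoising strength.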
Any sort of negative prompt should work more or less fine. I like using the EasyNegative embedding because it usually adds a fair bit of detail, but as long as you include (worst quality, low quality:1.4) in your negative prompt, you should be fine. Because of the training data, there's a good chance you'll inherit some watermarks or signatures; if that bothers you, add those to your negatives too.
I can't really stop you, but many of the people whose work went into this have asked that their work not be monetized, and since this is a merge of their models, I want to pass that request on to you. You can do whatever you want with the model once you download it, but please keep their wishes in mind.