File: bmi.png
(61 KB, 587x335)
how many fat lards do we have here
5 posts omitted. Click Reply to view.
>>
File: Screenshot (308).png
(16 KB, 375x225)
I think I should lose weight soon:sweat:
>>
File: image.png
(42 KB, 389x372)
fat is evil and shall be destroyed
>>
File: image.png
(43 KB, 781x432)
almost
>>
>>178875
join teh fat club
>>
>>178880
i renounced my membership around 6 months ago, im not sure if theyll have me again

File: do you suck dicks.jpg
(51 KB, 465x498)
he really does like the black men having sex, doesnt he?

File: meandwhomst.jpeg
(36 KB, 959x720)
Me and who?
>>
u
>>
>>178864
n mi, n mi n u (´人`)

File: 32.png
(1.45 MB, 2068x780)
Hello Heyuri. If you could get a love doll of TWO of the following but NOT all three, which pair do you pick and why?
1) Hatsune Miku (original KEI design)
2) Megurine Luka
3) Kagamine Rin (Future Style)
9 posts omitted. Click Reply to view.
>>
miku & rin coz they're cute :hokke:
>>
I'd get Miku and Luka (takoluka edition).
>>
Miku and Luka, entirely because I don't like the Future Rin design much.

If it was regular Rin, it'd be perfect, and thus I would pick Miku and Rin.
>>
>>178739
agree :onigiri:
>>
File: rin-white.png
(242 KB, 700x700)
>>178726
Hmm, black clip or white clip?

File: egg.jpg
(131 KB, 1280x720)
:tehegg:

File: wamu353.jpg
(39 KB, 600x382)
O hi yo :waha:
>>
File: x14_004.gif
(59 KB, 167x250)[Animated GIF]
o hai thar

File: Fi9TRNPaYAAOvGd.jpg
(214 KB, 1200x1200)
Ducks.
>>
DICKS
>>
DICKS!? (;゚Д゚)
>>
File: duckroll.jpg
(48 KB, 598x477)
DUCKROLL
>>
>>

File: easter.jpg
(4.84 MB, 4000x2950)
happy easter heyuri ( ´ω`)
>>
y-you got taiga making chocolate for you!?!? :angry:

also.. happyy easter! :biggrin:
>>
File: taiga.jpg
(941 KB, 1600x1200)
!
>>
:tehegg:

File: fate order.jpg
(212 KB, 1920x1080)
as i finished first route - unlimted blade, i wonder if i should go for teh tony hawk route next? I seen some clips of him doing 360s and it lookz cool. But i still have Sakuras route. I am thinking of just watching teh movies instead of finished the vn. Thoughts?

File: rei.jpg
(420 KB, 1193x843)
behold!
>>
Plain characters like me are only good enough for cloned manko... Fine, I'll have sex with teh Rei! :cry:
>>
>>178542
she's older than you sick fuck
>>
File: 1390895179602.jpg
(22 KB, 500x508)
AAAAAAHHHHHHHHHH WTF IS THAT
>>
File: cant get up im gay.jpg
(31 KB, 720x405)
>>
what is my waifu doing here????

File: Screenshot 2026-04-03 013921.png
(44 KB, 1255x232)
>>
I could not ヽ(;´Д`)ノ
>>
File: strawberry panics 2.jpg
(292 KB, 2586x497)
UPDATE: I could ヽ(´∇`)ノ

YURI POWAR!!!
>>

Hello, I'm back and I have a major update to my VOCALOID project! I have successfully achieved shape-invariant pitch transposition!

Here it is.
First the original audio: https://files.catbox.moe/zmt3rr.wav
Now my version with WBVPM (pitched down by an octave): https://files.catbox.moe/kho97n.wav
And a version using a naive pitch shift: https://files.catbox.moe/xs39bq.wav

Notice that my version, while noisier, sounds more natural and has less phasiness. This is particularly noticeable if you play both at very low volume: one sounds much more 'human' than the other.

Also note that this is an extreme example with an octave shift (1200 cents); in practice, shifts would typically be far smaller. It also doesn't implement several other parts of the system (more on that later).
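
For reference, a shift in cents maps to a frequency ratio of 2^(cents/1200), so an octave down (-1200 cents) halves f0. A minimal Python sketch of that arithmetic follows; the function names are made up for illustration and are not taken from the project.

```python
def cents_to_ratio(cents: float) -> float:
    """Frequency ratio for a pitch shift in cents (100 cents = 1 semitone)."""
    return 2.0 ** (cents / 1200.0)


def shift_f0(f0_hz: float, cents: float) -> float:
    """Fundamental frequency after shifting f0_hz by the given number of cents."""
    return f0_hz * cents_to_ratio(cents)


# An octave down halves the frequency; a single semitone is about a 5.9% change.
octave_down = cents_to_ratio(-1200.0)
semitone_up = cents_to_ratio(100.0)
```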

I'll explain all of this in a moment, but first, I'd like to correct some major biographical errors. Since this is a long post, I've divided it into sections.

BIOGRAPHICAL CORRECTIONS

In the last post, I claimed that VOCALOID1 used Narrow-Band Voice Pulse Modeling while VOCALOID2 and onwards used Wide-Band Voice Pulse Modeling. This was incorrect, and it was also the source of most of my confusion surrounding the paper.

What actually happened is that the research that would later become VOCALOID1 started out in the late 1990s as work to improve the existing Spectral Modeling Synthesis (SMS) system, which had been developed in the early 1990s. Importantly, this system evolved, and its techniques were combined with techniques from a system then in development called a Phase-Locked Vocoder (PLVC); the result was released as VOCALOID1. In the mid-2000s, work began on combining the techniques learned from improving SMS and from the PLVC-based system with the much older and well-known TD-PSOLA system. Importantly, TD-PSOLA (Time-Domain Pitch Synchronous OverLap and Add) was a time-domain system, while SMS was a frequency-domain system (TD-PSOLA was also pitch synchronous, hence the name, while SMS had a constant hop size). The first technique they developed was Narrow-Band Voice Pulse Modeling, followed later by Wide-Band Voice Pulse Modeling. Wide-Band Voice Pulse Modeling ended up being used in VOCALOID2.

Now that I understand this, I also understand the major mistake I made when reading the paper: I was reading it from the perspective of an implementer, thinking of the sections as the steps to implementing it instead of as research. I had thought that section 2.2 described the core processing algorithms, when it was actually about SMS, and importantly, about *the improvements they made to SMS*, not a complete description of SMS, since SMS was already an established technique. Hence my confusion about why some things were seemingly vaguely explained: *the paper wasn't about them*. At the same time, much of that section is still very useful, because much of that research was also incorporated into the later techniques.

RESULTS

I have successfully implemented the Wide-Band Voice Pulse Modeling; synthesis; and pitch transposition, time stretching, and timbre scaling algorithms. Additionally, I have also finished implementing the full version of the pitch estimation module, changed the code to work using overlapping windows, implemented the window adaption system, and fixed countless bugs.


Comment too long, view post No.178285 to see the full comment.
10 posts omitted. Click Reply to view.
>>
I wish I was as interested in anything as OP is in whatever he's talking about :dizzy:
>>
>>178483
Well actually I'm implementing the techniques that were used for VOCALOID2. But a VOCALOID1-like engine could be an interesting future project.
>so much work was put into our silly vocaloid voices we really should be grateful it even exists huh...
https://www.tdx.cat/bitstream/handle/10803/7555/tjbs.pdf?sequence=1&isAllowed=y
>>
>>178483
Wait what happened to my tripcode??
>>
>>178626
sorry

i ate it
>>
Hello, I'm back with another update to my VOCALOID project. It's not as big an improvement as last time - and in fact, there are no new features - but I felt like it was worth posting. I've been trying to rectify the major issues before I move on to implementing the Excitation plus Resonance model.

The first thing I attempted to tackle was all the added noise at high frequencies.
Here's the original spectrum: https://files.catbox.moe/fq55bo.png
And here's the reconstructed spectrum (with no transforms applied): https://files.catbox.moe/gq7jff.png
You can clearly see the high frequency artifacts. The first thing I tried was something mentioned in the paper. The WBVPM section mentions two approaches to a non-integer size discrete Fourier transform: the first is repeating the signal, while the second is upsampling it. I went with the second, as the former is patented and the second is easier to implement. It is mentioned that increasing the repetition count of the signal (or in the case of upsampling, the upsampling factor), and then discarding the higher frequencies, can improve the estimation by reducing artifacts. In the case of repetition, it is also mentioned that quadratic interpolation can be used in the resulting spectrum; however, I am not sure if this can be done for upsampling, and as such I have not tried to implement it for now.
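
To illustrate the upsampling approach, here is a rough Python sketch, not the project's actual code. It assumes the pulse onset and period (both possibly non-integer, in samples) are already known, resamples exactly one period onto an integer number of points roughly `factor` times denser than the input, takes a plain DFT, and keeps only the harmonics below the original Nyquist frequency. Linear interpolation stands in for a proper resampler, and `lerp_sample`, `pulse_spectrum`, and their parameters are hypothetical names.

```python
import cmath
import math


def lerp_sample(x, t):
    """Linearly interpolate the sequence x at fractional index t."""
    i = int(math.floor(t))
    frac = t - i
    return x[i] * (1.0 - frac) + x[i + 1] * frac


def pulse_spectrum(x, onset, period, factor=3):
    """DFT of one pulse whose period is a non-integer number of samples.

    The pulse is resampled to n = round(period * factor) points so that
    exactly one period fits the transform; only harmonics below the
    original Nyquist frequency are returned (bin k is harmonic k).
    """
    n = int(round(period * factor))
    step = period / n  # spacing of the resampled points, in input samples
    pulse = [lerp_sample(x, onset + i * step) for i in range(n)]
    keep = int(period // 2) + 1  # harmonics 0 .. floor(period / 2)
    spectrum = []
    for k in range(keep):
        acc = sum(pulse[i] * cmath.exp(-2j * math.pi * k * i / n)
                  for i in range(n))
        spectrum.append(acc / n)  # normalised so a unit cosine gives about 0.5
    return spectrum
```

For a cosine with a period of 10.5 samples, `pulse_spectrum(x, 0.0, 10.5)` puts nearly all of the energy in bin 1 (the fundamental), slightly attenuated by the linear interpolation.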

Here's the result after applying an upsampling factor of 3: https://files.catbox.moe/qcgnzq.png
Here's the original audio: https://files.catbox.moe/f7g8ta.wav
The original reconstruction: https://files.catbox.moe/da0m1i.wav
And now with the improved reconstruction: https://files.catbox.moe/513ycn.wav
You can see an improvement, especially at lower frequencies; however, the high frequency artifacts largely persist, so they have to be arising elsewhere. I realized the source was the reconstruction of the signal (AKA the "synthesis"). I had previously implemented a synthesis method that was quite different from the one used in the study, because I did not understand the method in the study at first. My synthesis method worked as follows: for each output sample, take the voice pulse closest to that sample and set the sample to the interpolated value of a spline representing a time-domain version of the upsampled voice pulse, stepping through the spline at a rate corresponding to the ratio between a sample in the regular time domain and a sample in the upsampled time domain. Now, in some cases, estimation inaccuracies and differences from any transformations that were applied result in these regions of samples being bigger than the actual pulse itself. In these cases, we take advantage of the periodic nature of the voice pulse and repeat it (i.e. sampling before the start is equivalent to sampling at that offset from the end, and sampling after the end is equivalent to sampling at that offset from the start). However, this method results in discontinuities in some cases.
Here is an example of such a discontinuity: https://files.catbox.moe/jnnxfj.png
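
A minimal Python sketch of this periodic continuation idea, under several assumptions: linear interpolation replaces the spline, "closest" is taken to mean the pulse whose centre is nearest to the output sample, and every name here (`sample_periodic`, `render_nearest`, `onsets`, `periods`) is hypothetical rather than taken from the project.

```python
def sample_periodic(pulse, pos):
    """Read an (upsampled) pulse at fractional position pos, wrapping around.

    Positions before the start or after the end are folded back into the
    pulse, treating it as one period of a periodic signal.
    """
    n = len(pulse)
    pos = pos % n  # Python's % maps negative positions into [0, n)
    i = int(pos)
    frac = pos - i
    return pulse[i % n] * (1.0 - frac) + pulse[(i + 1) % n] * frac


def render_nearest(pulses, onsets, periods, length):
    """Fill each output sample from the pulse whose centre is nearest to it."""
    out = []
    for s in range(length):
        j = min(range(len(pulses)),
                key=lambda k: abs(s - (onsets[k] + periods[k] / 2.0)))
        # Step through the stored pulse at (stored length / period) per output sample.
        step = len(pulses[j]) / periods[j]
        out.append(sample_periodic(pulses[j], (s - onsets[j]) * step))
    return out
```

Because each output sample is read from exactly one pulse, nothing blends neighbouring pulses at the point where the nearest pulse changes, which is where a jump like the one in the image can appear.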
I began to try to implement an interpolation system. In this system, we would calculate the gap between pulses - or, in the case of inaccuracies in the other direction (i.e. overlapping pulses), the overlapping area - and interpolate linearly between one pulse and the other. However, this approach was complicated significantly by the non-integer (and potentially differing) sizes of the pulses, as well as numerous edge cases. For this reason, I struggled and spent over an hour trying to figure out how to do it correctly. About halfway through, I decided to check the paper again, and this time I understood the actual synthesis method properly, largely because of a diagram I had missed the first time.
In the actual method, each pulse is expanded in a manner similar to the border interpolation technique used in WBVPM analysis, except kind of in reverse. For each voice pulse, we generate extensions on both sides, each extension having a size equal to the border interpolation ratio times the size of the voice pulse. Then we apply a trapezoidal window to the voice pulse, which starts at zero at each edge of the extended voice pulse and reaches 1 after a ramp of twice the border interpolation size on each side. Then we overlap and add the voice pulses.
This technique fixes the discontinuity issue because it effectively results in each border-interpolation-length side of each voice pulse being interpolated linearly with the corresponding section of the neighbouring voice pulse over a span of twice the border interpolation size. However, this only holds perfectly when the fundamental frequency is the same for both voice pulses (and thus they are the same size) and their onsets are spaced exactly one fundamental period apart. When this is not the case, some amount of modulation occurs that results in some voice pulses being attenuated while others are accentuated. This is especially noticeable when there are large inaccuracies in the fundamental frequency estimation and/or the voice pulse onset sequence.
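
The window behaviour described above can be checked numerically. The Python sketch below only tracks the summed window gain, not the pulse samples themselves, since that gain is what produces the modulation. It assumes a border interpolation ratio below 0.5; `trapezoid`, `window_sum`, and their parameters are hypothetical names.

```python
def trapezoid(t, period, border):
    """Trapezoidal window for a pulse spanning [0, period), extended by border.

    Rises from 0 at t = -border to 1 at t = +border, stays at 1, then
    falls from 1 at t = period - border to 0 at t = period + border.
    """
    if t <= -border or t >= period + border:
        return 0.0
    if t < border:
        return (t + border) / (2.0 * border)
    if t > period - border:
        return (period + border - t) / (2.0 * border)
    return 1.0


def window_sum(onsets, periods, ratio, length):
    """Sum of all pulse windows at each output sample (the overlap-add gain)."""
    total = [0.0] * length
    for onset, period in zip(onsets, periods):
        border = ratio * period
        for s in range(length):
            total[s] += trapezoid(s - onset, period, border)
    return total


# Equal periods, onsets exactly one period apart: the ramps cross-fade to 1.
steady = window_sum([0, 10, 20, 30], [10, 10, 10, 10], 0.2, 32)
# Third onset placed 3 samples late: the gain dips between the pulses.
late = window_sum([0, 10, 23], [10, 10, 10], 0.2, 32)
```

In the steady case the gain is 1 everywhere between the first and last ramps, while in the late case it falls to 0.25 at sample 21, which corresponds to the attenuation described above.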
Here's the same section from before. Notice how now it does not have a discontinuity: https://files.catbox.moe/p26914.png
Now here's a zoomed-out version: https://files.catbox.moe/zacw8w.png
Now here's a section with large inaccuracies in the MFPA estimation that clearly shows large modulation artifacting: https://files.catbox.moe/efk1vx.png
Here's the new spectrum: https://files.catbox.moe/f94zse.png
You can see that while the high frequency artifacts are now gone, there are now more low frequency artifacts. In fact, the overall amount of artifacts is actually higher than before.
Here's the reconstructed audio: https://files.catbox.moe/ympfi0.wav
While I ended up solving this issue by fixing large inaccuracies in the MFPA system, it is interesting to note that my approach is more resilient to estimation inaccuracies. Perhaps for a future improved vocal synthesizer, it would be worth exploring a variant of my periodic continuation technique adapted with an interpolation method that could handle changes in pulse onset and f0.

The first thing I tried was switching the amplitude in the MFPA function from a linear scale to a magnitude-limited logarithmic scale. However, this had little to no effect. The next thing I tried was adjusting the size, in periods, of the window used for the peaks that are fed into MFPA, but again this had little to no effect. Next, I tried implementing the harmonic peak selection algorithm I proposed in the previous post, but again this had little to no effect.

Comment too long, view post No.178806 to see the full comment.

File: chensmug03.jpg
(34 KB, 474x474)
heh, you are not a coffee~
how lame. :lolico:
>>
File: satorismug03.png
(291 KB, 474x474)
is what you would like to think, but too bad! I'm a fair share of coffee too, myself!
>>
ehhh~ thats lame coffee! who would wanna drink shit diarrea!?!
>>
File: shikanoko.jpg
(495 KB, 1878x2048)
キタ━━━(゚∀゚)━━━!!

BRAND NEW social website idea concept. Steal it. DELIBERATELY unlike social medias or imageboards.

A brand new person opening the website is greeted by a page filled with a plethora of unknown symbols; colourful logos all of the same size. Sigils if you will. But more like seals, like flags. All same size (if circle, all of them are in circles). No names. No other identifiers. Just the seals. Just intuit which means what. Pure intuition. The seals don't seem to mean anything. They're just a seemingly gibberish image, ALWAYS.

The only other text on the screen is the instruction to pick one carefully because once picked that's the Party you start with. A Party is a group of min 4 up to 6 people.

Once in a Party, you find out that you can't speak (type) aside from a SINGLE emoji per post. The Party members start to give you Questions with Answer Choices. After some time answering, you find out you've gained some really basic Words. Like pronouns, prepositions, grammar words, adverbs. But of course, not "Life" words (Words for anything other than grammatical purposes, like objects in the real world). The Party, if they're satisfied and haven't decided to kick you out, lets you stay in it. You find you're limited in the number of words you can use per post, but as time goes on, the number increases.

The rest of this website is at this point free-for-all ideas on what next to implement, basically; after this starting point, there are two essential things:
a) gaining Words. The website owner decides how this is to be done.
b) the chance for a Party to one day meet other Parties that are also using the site. The website owner decides how this is to be done.

The whole meaning of this thing is to give meaning to our own Speech on the internet, that is, to be able to even SPEAK at all, one must WORK towards it. The site is, yes, 'gamified', and it's deliberately contrary to the whole concept of social websites in that the more energy and time you spend on the site, the more you're able to speak and express yourself. The more effort you put in, the more freedom you have.

You are fully CENSORED at the very beginning, speaking basic grunts and mere emojis, and only by your own effort do you build up your own FREE('d) Speech from absolute nothing. Like I said, the whole concept of the website is deliberately opposite to social media and imageboards where you enter and you can start saying anything you like (100/100). Here, you begin right from zero (0/100).

Continuing a bit. From afar you see some people co

Comment too long, view post No.178545 to see the full comment.
7 posts omitted. Click Reply to view.
>>
File: scrot_000.png
(130 KB, 1168x651)
>>178555
>Personally I'm kind of a fan of the old Android emojis
Same, I use them as the emoji font on my computer. The blobs were cute.
Although I tend to use emoticons when I just want something simple like :) or :D
>>
OP here.
Everything Ive said is of course alterable by those who can code the thing.
But to centralize the point, the main concept of the site is exactly inverse to everything else thats available today that a person can signup and then say whatever they want to whoever they want at any place (threads) they want. No, mine is everything is closed off from you by default. You can sense they are there (imagine faint blurred pixels), but you can't access them yet.

Anyway, to acutize the vision of some possible concepts..
- User grinds Words and Length of posts. Basically like the words per post in twitter or chans, user starts limited, and grow from there
- In the beginning there are only few 'Boards' one can travel in. Perhaps maybe one, 'Home Board' if you will, and boards have relative distances, thus it's easier to get to /sci/ from /x/ than to /mu/.
- At first user will only see what's happening on that particular day, and as time goes on, user will be able to see yesterday's posts, then 3 days earlier, then 2 weeks earlier, this goes on until even years earlier.
- There is no banning because of insulting words. Bad Words are basically privilege, those who got it are actually Frequent Timespenders of the site, that's how they got them in the first place. They earned it. Racial slurs are RARE shiny pokemons, the mechanism of getting them is unknown, but users understand that only someone who spends a lot of time on the site finds them.
This mechanism avoids the whole concept of mods keeping a website altogether, there's no need, "if you want to evilpost, spend more time in the site and earn it first".
- No such thing as microtransaction shit. Money ruins things. le 4ch's vip pass is part of the downfall.
- Everything depends on Spent Time. Thus in the website Oldfags always have the best privileges.
- The whole Party system does feel hazy to me at this time. As time goes on, you stumble onto other individuals, and you understand because they're like you, they're of course part of some other party too. The Party is probably a removable concept; I'm thinking because this World is "Total Limitation by Default at the Start", the amount of people you get to see is very limited at the start, but maybe they don't have to be a "party". Maybe it could be a thread you're stuck with, or a 'place', whatever that is. But point is, you're stuck, and you work your way out.
But another point of this feature is I want a kinda forced socialization. Like how "you're good friends with th

Comment too long, view post No.178694 to see the full comment.
>>
File: peoto.jpeg
(91 KB, 1883x694)
does this look good, op?
>>
>>178700
haha thats cool but bit too simplistic, tbh i was thinking something as ornate as this https://en.wikipedia.org/wiki/Mon_(emblem)#Gallery_of_representative_kamon_by_theme
>>
>>178714
Hmm.. I’ll see what i can do

File: stalin.jpg
(63 KB, 550x548)
did u know stalin was into roris? :x3:
>>
they dont care about my cold exterior!
>>
he could have doned better

