Off-Topic@Heyuri
it's the place to be!
File:
m1772488142904.png
(444 KB, 2003x1640)
VOCALOID1 MFPA implementation
QueueSevenM
◆Tnq5UWtkfs
2026/03/04
(Wed)
18:37:41
No.
175057
So I've been working on a project to re-implement the VOCALOID1 engine.
I'm basing it on the description in Jordi Bonada's PhD thesis "Voice Processing and Synthesis by Performance Sampling and Spectral Models" and not the original papers as the former is more detailed, easier to follow, and also describes the VOCALOID2 engine.
After a lot of trouble with getting TWM f0 estimation to work, I've finally gotten to implementing MFPA. And amazingly, it seems to have worked first try.
Compare my results:
https://i.ibb.co/dsvgv0fd/Screen-Shot-2026-03-02-at-3-54-48-PM.png
To the results in the study:
https://i.ibb.co/C3fjdWVd/Screen-Shot-2026-03-02-at-3-55-09-PM.png
>>
1
QueueSevenM
◆Tnq5UWtkfs
2026/03/04
(Wed)
18:40:59
No.
175058
File:
m1772489166399.png
(282 KB, 1754x1278)
Also here's the graph of the f0 estimate from the TWM.
It's still somewhat flawed (see the jump down at frame 37), and it required using unusual parameters (Kaiser-Bessel beta 2.2 instead of the 1.95 recommended by the study, and only 6 harmonics instead of 11) to avoid instabilities even in relatively trivial scenarios.
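For readers who want to try the two window settings mentioned above, here is a minimal sketch of a Kaiser-Bessel window in plain Python. It assumes the common definition w[n] = I0(beta * sqrt(1 - r^2)) / I0(beta); the thesis may define beta differently (some texts use alpha with beta = pi * alpha), so treat the numbers as illustrative. The helper names `bessel_i0` and `kaiser_window` are made up for this example.

```python
import math


def bessel_i0(x, terms=30):
    """Zeroth-order modified Bessel function of the first kind (power series)."""
    total = 0.0
    term = 1.0  # k = 0 term of sum(((x / 2)^k / k!)^2)
    for k in range(1, terms + 1):
        total += term
        term *= (x / 2.0) ** 2 / (k * k)  # step from term k-1 to term k
    return total


def kaiser_window(length, beta):
    """Kaiser-Bessel window, assuming the I0(beta * sqrt(1 - r^2)) / I0(beta) form."""
    denom = bessel_i0(beta)
    window = []
    for n in range(length):
        ratio = 2.0 * n / (length - 1) - 1.0  # runs from -1 to +1 across the frame
        arg = beta * math.sqrt(max(0.0, 1.0 - ratio * ratio))
        window.append(bessel_i0(arg) / denom)
    return window


# The two settings from the post, on a 256-point frame.
w_recommended = kaiser_window(256, 1.95)
w_used = kaiser_window(256, 2.2)
```

Under this definition a larger beta tapers the frame edges more strongly, which trades a wider main lobe for lower side lobes.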
Actually, I think I've finally figured out what's been wrong with the TWM this whole time: I forgot to convert the frequency in bins to the frequency in hertz. I think this was originally intentional, because I only realized much later that the error formula is neither linear nor even relative. I haven't tested the fix yet, but I'd imagine it should finally solve the problems I've had with TWM.
This graph specifically shows the estimated fundamental frequency for each 256-point frame of an E4 /e/ phoneme.
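To illustrate why the units matter, here is a rough sketch of a two-way mismatch error in plain Python. It assumes the Maher and Beauchamp formulation with its commonly quoted constants (p = 0.5, q = 1.4, r = 0.5, rho = 0.33); the variant in the thesis may differ. The 44100 Hz sample rate, the peak amplitudes, and the function names are all made up for the example, and this is not the poster's implementation.

```python
def bin_to_hz(bin_index, sample_rate, fft_size):
    """Convert a (possibly fractional) FFT bin index to a frequency in hertz."""
    return bin_index * sample_rate / fft_size


def twm_error(f0, peak_freqs, peak_amps, n_harmonics=6,
              p=0.5, q=1.4, r=0.5, rho=0.33):
    """Two-way mismatch error of candidate f0 against measured spectral peaks."""
    a_max = max(peak_amps)
    harmonics = [f0 * (h + 1) for h in range(n_harmonics)]

    # Predicted -> measured: match each predicted harmonic to its nearest peak.
    err_pm = 0.0
    for fh in harmonics:
        k = min(range(len(peak_freqs)), key=lambda i: abs(peak_freqs[i] - fh))
        w = abs(peak_freqs[k] - fh) * fh ** -p
        err_pm += w + (peak_amps[k] / a_max) * (q * w - r)

    # Measured -> predicted: match each measured peak to its nearest harmonic.
    err_mp = 0.0
    for fk, ak in zip(peak_freqs, peak_amps):
        fh = min(harmonics, key=lambda h: abs(h - fk))
        w = abs(fk - fh) * fk ** -p
        err_mp += w + (ak / a_max) * (q * w - r)

    return err_pm / n_harmonics + rho * err_mp / len(peak_freqs)


def estimate_f0(candidates, peak_freqs, peak_amps):
    """Pick the candidate with the smallest TWM error."""
    return min(candidates, key=lambda f0: twm_error(f0, peak_freqs, peak_amps))


# Hypothetical E4 frame: six harmonic peaks, 256-point FFT, assumed 44100 Hz.
SAMPLE_RATE, FFT_SIZE = 44100, 256
peaks_hz = [329.63 * h for h in range(1, 7)]
peaks_bins = [f * FFT_SIZE / SAMPLE_RATE for f in peaks_hz]
amps = [1.0, 0.8, 0.6, 0.5, 0.4, 0.3]

best = estimate_f0([164.815, 329.63, 659.26], peaks_hz, amps)
err_hz = twm_error(335.0, peaks_hz, amps)
err_bins = twm_error(335.0 * FFT_SIZE / SAMPLE_RATE, peaks_bins, amps)
```

The mismatch term is a frequency difference times f^(-p), so rescaling every frequency by a factor s rescales that term by s^(1 - p), while the constant r stays fixed. With p = 0.5 and one bin being about 172 Hz in this setup, working in bins shrinks the mismatch penalty by roughly a factor of 13 relative to the amplitude term, which is why `err_bins` and `err_hz` disagree for the same mistuned candidate.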
>>
2
Anonymous
2026/03/04
(Wed)
18:44:20
No.
175062
In Japanese, please (日本語でおk)
>>
3
QueueSevenM
◆Tnq5UWtkfs
2026/03/04
(Wed)
18:46:26
No.
175063
>>175062
What do you mean?
>>
4
QueueSevenM
◆Tnq5UWtkfs
2026/03/04
(Wed)
18:49:11
No.
175065
Actually I meant IPA /i/ not /e/.
>>
5
TEH RAPEMAN
2026/03/04
(Wed)
19:24:36
No.
175070
I don't understand it well myself but i admire your efforts
ヽ(´ー`)ノ
>>
6
Anonymous
2026/03/04
(Wed)
19:47:55
No.
175073
>as the former is more detailed, easier to follow, and also describes the VOCALOID2 engine.
you went with the harder to program paper? what made you choose this one?
also, wat are you planning to do with it afterwards? is it just a programming exercise?
ヽ(゚ρ゚)ノ
>>
7
QueueSevenM
◆Tnq5UWtkfs
2026/03/04
(Wed)
20:10:23
No.
175088
>>175073
>you went with the harder to program paper? what made you choose this one?
No, "former" means first in the sentence, not chronologically.
>also, wat are you planning to do with it afterwards? is it just a programming exercise? ヽ(゚ρ゚)ノ
It was just that originally, but I now plan to eventually release it as an open-source library.
>>
8
QueueSevenM
◆Tnq5UWtkfs
2026/03/04
(Wed)
20:11:49
No.
175090
>>175070
You'd be surprised. Try just reading the paper from the start. You may find that you can actually understand it.
https://www.tdx.cat/bitstream/handle/10803/7555/tjbs.pdf?sequence=1&isAllowed=y