The future of vocaloid software

The migration of voice sets

A bit of history

Let me preface this by saying I have not been super invested in the finer
details of the VOCALOID software itself. As someone who lacks musical talent,
I found most of the software to be more of an 'implementation detail'. However,
recent shifts in the software being used have been interesting, so I figured
I would try to speak on this a bit.

The first important thing to note about the software used in vocaloid is that the
DAW and the voice sets are owned by different companies. The VOCALOID DAW is made and
sold by Yamaha, whereas a voice set like Kizuna Akari is made by AH-Software. In this regard
it seems, at least to me, that the companies that sell the voice sets
act more as "talent agencies" than as software houses.

In the beginning there were two dominant sets of software: VOCALOID itself and the
software used for UTAU, with UTAU being a free (as in beer) alternative. Things started
to change a bit with the V3 variants of the Cryptonloids (Miku, Rin & Len, etc.),
in which the voice sets came bundled with Piapro Studio. Piapro Studio can be used either
as a standalone piece of software or inside a separate DAW through VSTi. However,
with other companies that produced voice sets having fewer resources
(or not being fully invested in vocaloid), I believe investing
in a solution similar to Piapro Studio was not feasible for them. From what I have heard,
the experience of the VOCALOID software has not aged very well, making things more
difficult than they should be. I am curious whether the availability of Piapro Studio
has contributed to the continued use of Cryptonloids by newer artists.

Talk Talk Talk

Another interesting development is the newfound interest in having voice sets "talk".
I have mentioned in the past how some newer artists have incorporated talking
components into newer pieces of music, but it has also grown popular in
and of itself, with a growing number of game playthroughs narrated by "talkoids".
I am also sure the prospect of otaku buying the voice sets to have them read whatever
they like helped hasten the adoption as well. To my knowledge, the first premier software
for making use of this talking component was VOICEROID, made by the aforementioned
AH-Software.

Voice sets brought to you by skynet

While I think the VOCALOID software itself was 'good enough' to sell the early days
of vocaloid, I think it started to crack under the weight of people wanting more from it.
Also, from what I've heard, just in terms of making the voice sets _sound_ better, the
VOCALOID software has not advanced much. So what we end up with is a community
wanting to move on and adopt new voice sets, but a software scene that is not very
supportive of this. There's also a bit of an elephant in the room when it comes to complex
computing problems like voice synthesis, one that has reared its head within the last 5 years
or so. That, of course, is machine learning. I'd be amazed if you had been around the tech
industry and had not been subjected to hearing about all the new advancements in this
area.

Now to get to the point of this post in the first place: the new pieces of software
that are now being released and gaining adoption, namely SynthV, Neutrino, and
CeVIO AI. From what I can tell, SynthV seems to be fitting into the spot of UTAU,
having a shareware version and touting a lot of less popular voice sets, and, as far as
I can tell, making no use of the magical AI/Deep Learning buzzwords. Neutrino seems to
be a smaller piece of software made by people focused solely on the AI aspects. Instead
of a full editor, it takes MusicXML score files, passes them through a magic C++ program,
and spits out .wav files. Of note is the Kiritan Neutrino set, which I think has quite
the catchy sound to it.

The last and newest piece of software, CeVIO AI, is the one I think shows the
most promise. Released just this past January, it already seems to have plans
to bring over quite a few of the more popular "secondary" voice sets like IA, Akari, and
Yukari, while also claiming to make use of AI like the Neutrino group. However,
unlike Neutrino, it features a full software suite that allows fine tuning of the
voice sets like the VOCALOID software does, but with better results and usability, based
on the user reports I've seen.

Of particular interest is KAFU, a new voice set made specifically to target the new CeVIO AI
workstation, which has been featured in some of the new songs that have cropped
up within the last month or so. The sound of KAFU is quite great, and I think the full
advantages of the AI magic are on display with it.

Conclusion

I think this new set of software coming out will be the perfect 'jolt' the community
needs, both welcoming new producers and giving longtime producers more to
work with in their sound. I am very excited for the future, both for the new voice sets
flowing out and for the "redos" of the older sets. Of particular interest is the
adoption of the IA voice set, which I think has long been neglected since Jun stopped
producing music. I also hope Una (and Internet Co in general) makes the jump over to
the new software.

Links

R Sound Design KAFU

OSTER Project Neutrino Kiritan

_Natural Neutrino Kiritan

OSTER Project CeVIO Kiritan

Ishifuro IA