MusicRevU's guide to UTAU
Introduction
Creating your UTAU
  Installing UTAU
CV (Consonant – Vowel) Voice banks
  Sounds to record
VCV (Vowel-Consonant-Vowel) Voice banks
  Sounds to record
The next steps
OTO.ini
Last steps
UTAU Flags
Using USTs
Mixing
Wiki creation
Character Profile Template
Changing System Locale
How to type in Japanese Hiragana
How to create a UTAU
Tutorials about Mixing
Software
UTAU Voicebank for best reference
UTAU User Guide
UTAU wiki
How to make an UTAU sound better

Introduction

Hello there! I'm Hoshi. I go under the alias of MusicRevU for all my UTAU work, and I am creating this little tutorial to help any future UTAU users. When I was creating Minuet, I found a lot of the tutorials out there quite difficult to understand, so I decided to share my experiences and provide some details on how to create an UTAU! These are just little things I found on my own; I learned how to use UTAU mostly through two of my friends who use the software. So please read carefully and let me know if this helps you!

Creating your UTAU

Alrighty, so we begin with the planning of your UTAU. It is usually good to begin with a concept of your character so you have a rough idea of how they look and, ideally, how you want them to sound. For example, when I created Minuet I gave her a slightly gothic look, as I wanted her voice to be soft and mature.
Once you have your concept out of the way, start making some little notes for your UTAU, such as their name, their name in Japanese and so on. Don't worry though: later on in this tutorial I will provide a template showing what info is required for your UTAU!

So you've now got a rough idea of what you want your UTAU to look like. The next step is to start recording this bad boy!! Yup, this is where the most work has to be done! To start off, here is what you need to download:

1. UTAU (latest version) - http://utau2008.xrea.jp/index.html
2. UTAU English patch - http://utau.wikia.com/wiki/UTAU_wiki:UTAU_GUI_Translation
3. Audacity - http://audacity.sourceforge.net/download/
4. LAME for Audacity (to export mp3) - http://lame.buanzo.org/

You can use any kind of audio software to record and mix. I personally prefer Audacity because I learned to use it at university and I know its interface very well.

So you have gotten everything you need, for now of course! Next we look at what sounds you need to record for your voice bank, and at the two common voice bank types you will find in the UTAU community.

Installing UTAU

Now I know some first-time users might encounter problems installing UTAU. This is because it is a Japanese program that does not use Unicode. What does this mean? A program like this relies on your computer's language settings to display Japanese characters, which is where the system locale comes in! It's all about programming, and to be honest it can be hard to understand if you're not very computer literate. I knew what it meant because my mum was a technician and taught me how to use computers; everything else was self-taught.

So first of all, you will need to change your locale if you're using Windows. I don't know how to do this on a Mac, as I can barely work UTAU-Synth. This is a guide to changing the locale on Windows 7, but I'm sure it's the same kind of method for any Windows computer!
The system locale determines the default character set (letters, symbols, and numbers) and font used to enter and display information in programs that don't use Unicode. This allows non-Unicode programs to run on your computer using the specified language. You might need to change the default system locale when you install additional display languages on your computer. Selecting a different language for the system locale doesn't affect the language in menus and dialog boxes for Windows or other programs that do use Unicode.

1. Open Region and Language by clicking the Start button, clicking Control Panel, clicking Clock, Language, and Region, and then clicking Region and Language.
2. Click the Administrative tab, and then, under Language for non-Unicode programs, click Change system locale. If you're prompted for an administrator password or confirmation, type the password or provide confirmation.
3. Select the language, and then click OK. To restart your computer, click Restart now.

(Taken from the Microsoft website; linked in the Useful Links section)

So that's how you do that, and by hitting shift+alt you can alternate between Japanese and English input. Be sure to do this when you're using non-Unicode Japanese programs like UTAU, because you want everything to run smoothly. This will mean voice banks like Teto or Defoko (the default with UTAU) should run, as they are only hiragana based. You can find lots of useful tutorials on how to type in Japanese hiragana with the locale changed! Find the link in the Useful Links section!

If you click on the A when you're in the Japanese locale, you will get these options. The one with the H lets you input hiragana, so this is handy if you wanna type your UTAU's name in Japanese. This can be a little complicated to understand, though, so only do this if you are confident enough to input romaji and have it translated to hiragana.

CV (Consonant – Vowel) Voice banks

CV stands for "Consonant-Vowel".
It is the traditional recording system of UTAU, being designed for the Japanese language. Sounds consist of either a single V ("vowel") sound (a, i, e, o, u, with n included) or a "consonant-vowel" sound (ka, ji, no, etc).

CV (Consonant - Vowel) is the first and most common voice bank type, created by Ayame/Ameya. It's a pretty simple voice bank type: just the Japanese syllables recorded, each in its own wave file. You would use a CV voice bank like this:

[a][ri][ga][to]

With these sounds you can create the word "Arigato". You will find most beginners start out with this sort of voice bank because it is the simplest to make. Remember, the sounds don't make the UTAU; it'll be the OTO.ini, which I will explain later on.

I am now going to list the sounds you must record to make a CV voice bank. If you want an idea of how long to record the sounds for, take a look at some UTAU voice banks like Namine Ritsu or Utaune Nami, and open the sounds to hear them and use them as a reference.

Sounds to record

• Breath (Romaji alias of “Br”) – Breath ↑ • a – あ • i – い • ye – いぇ • u – う • wi – うぃ • we – うぇ • wo – うぉ • e – え • o – お • ka – か • ga – が • ki – き • kye – きぇ • kya – きゃ • kyu – きゅ • kyo – きょ • gi – ぎ • gye – ぎぇ • gya – ぎゃ • gyu – ぎゅ • gyo – ぎょ • ku – く • kui – くぃ • kue – くぇ • kuo – くぉ • kua – くぁ • gu – ぐ • gui – ぐぃ • gue – ぐぇ • guo – ぐぉ • gua – ぐぁ • ke – け • ge – げ • ko – こ • go – ご • sa – さ • za – ざ • shi – し • she – しぇ • sha – しゃ • shu – しゅ • sho – しょ • ji – じ • je – じぇ • ja – じゃ • ju – じゅ • jo – じょ • su – す • sui – すぃ • sue – すぇ • suo – すぉ • sua – すぁ • zu – ず • zui – ずぃ • zue – ずぇ • zuo – ずぉ • zua – ずぁ • se – せ • ze – ぜ • so – そ • zo – ぞ • ta – た • da – だ • chi – ち • che – ちぇ • cha – ちゃ • chu – ちゅ • cho – ちょ • ji – ぢ • je – ぢぇ • ja – ぢゃ • ju – ぢゅ • jo – ぢょ • tsu – つ • tsi – つぃ • tse – つぇ • tso – つぉ • tsa – つぁ • zu – づ • te – て • ti – てぃ • tyu – てゅ • de – で • di – でぃ • dyu – でゅ • to – と • tu – とぅ • do – ど • du – どぅ • na – な • ni – に • nye – にぇ • nya – にゃ • nyu – にゅ •
nyo – にょ • nu – ぬ • nui – ぬぃ • nue – ぬぇ • nuo – ぬぉ • nua – ぬぁ • ne – ね • no – の • ha – は • ba – ば • pa – ぱ • hi – ひ • hye – ひぇ • hya – ひゃ • hyu – ひゅ • hyo – ひょ • bi – び • bye – びぇ • bya – びゃ • byu – びゅ • byo – びょ • pi – ぴ • pye – ぴぇ • pya – ぴゃ • pyu – ぴゅ • pyo – ぴょ • fu – ふ • fi – ふぃ • fe – ふぇ • fo – ふぉ • fa – ふぁ • bu – ぶ • bui – ぶぃ • bue – ぶぇ • buo – ぶぉ • bua – ぶぁ • pu – ぷ • pui – ぷぃ • pue – ぷぇ • puo – ぷぉ • pua – ぷぁ • he – へ • be – べ • pe – ぺ • ho – ほ • bo – ぼ • po – ぽ • ma – ま • mi – み • mye – みぇ • mya – みゃ • myu – みゅ • myo – みょ • mu – む • mui – むぃ • mue – むぇ • muo – むぉ • mua – むぁ • me – め • mo – も • ya – や • yu – ゆ • yo – よ • ra – ら • ri – り • rye – りぇ • rya – りゃ • ryu – りゅ • ryo – りょ • ru – る • rui – るぃ • rue – るぇ • ruo – るぉ • rua – るぁ • re – れ • ro – ろ • wa – わ • o – を • n – ん

This includes all the extra sounds you can record. A little tip is to record in romaji, as that makes it easier for you to know which sounds you have definitely recorded. If you record in romaji, be sure to set your aliases in UTAU to hiragana, and vice versa. The best reference you can use, as I said, is to study other UTAU voice banks, which will give you an idea of how the sounds are meant to sound. (I had never spoken Japanese before this and had to study the sounds, as Japanese syllables sound different to English syllables.) I used Utaune Nami's, Namine Ritsu's and Kyohakushi Alice's voice banks to get my sounds right.

VCV (Vowel-Consonant-Vowel) Voice banks

VCV stands for "Vowel-Consonant-Vowel". It is a phoneme technique used to record UTAU voice banks, also called "triphones" or "triphonics." By recording strings of syllables and using otos to split them up, one can crossfade vowels together before consonants for sound that flows more naturally. Example: "aRiGaTo" becomes "a ari iga ato", rather than "a ri ga to".

VCV (Vowel - Consonant - Vowel) is also a quite common voice bank type, created by Ayame/Ameya.
This voice bank type is a bit more confusing and not recommended to people who have just started using and making UTAUloids. A VCV voice bank is usually recorded in syllable strings, mostly of 5 or 7 syllables. Most of the time the configurations in the oto.ini look like this:

[i あ],[a た],[i そ]

With those VCV strings you can create a chain of sounds that blend with each other and sound smoother. For example:

[- a][a ri][i ga][a to]

With these strings you can create the word "Arigato", and it will sound smoother than with a CV voice bank. Thanks to the natural pronunciation available in a VCV voice bank, the UTAU sounds more "human". (It doesn't reduce the noise of resamplers.)

VCV is the next step for a voice bank AFTER you have been using UTAU for a while and are familiar with it. This is NOT recommended for beginners, as it requires a lot of work on the oto.ini. There are also several ways to record this type of voice bank, starting with 2 mora and going up to 7 mora. Because in the future I aim to make my UTAU sound as realistic as possible, I'll note down the 7 mora list and provide links later to the necessary voice banks to research.
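To make the "Arigato" example above a bit more concrete, here is a small Python sketch of how a line of CV syllables could be turned into VCV-style aliases. It is only an illustration under a few assumptions of mine: the function names are made up, the input is romaji, and every syllable ends in a, i, u, e, o or is the lone "n". It is not how UTAU itself does the conversion.

```python
VOWELS = set("aiueo")


def ending_sound(syllable):
    """Return the sound a romaji syllable ends on ('n' for the lone n)."""
    if syllable == "n":
        return "n"
    if syllable[-1] not in VOWELS:
        raise ValueError(f"not a romaji syllable: {syllable!r}")
    return syllable[-1]


def cv_to_vcv(syllables):
    """Prefix each syllable with the ending sound of the one before it."""
    aliases = []
    previous = "-"  # "-" stands for "nothing sung before this note"
    for syllable in syllables:
        aliases.append(f"{previous} {syllable}")
        previous = ending_sound(syllable)
    return aliases


print(cv_to_vcv(["a", "ri", "ga", "to"]))
# ['- a', 'a ri', 'i ga', 'a to']
```

The result matches the [- a][a ri][i ga][a to] chain shown above: each alias carries the tail of the previous sound, which is what lets the notes blend.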
Sounds to record

• a_a_i_a_u_e_a • e_e_u_o_e_o_o • o_u_n_a_n_u • ke_ke_ku_ko_ke_ko_ko • za_za_ji_za_zu_ze_za • zi_zi_ju_ja_je_zi_je • zu_zu_ji_zo_za_zo_ji • chi_chi_tsu_ta_te_chi_te • che_che_chu_cho_che_cho_cho • cha_cha_ti_cha_chu_che_cha • chu_chu_ti_cho_cha_cho_ti • cho_chu_n_cha_n_chu • ku_ku_ki_ko_ka_ko_ki • i_i_u_a_e_i_e • i_i_yu_ya_ye_i_ye • ye_ye_yu_yo_ye_yo_yo • wi_wi_u_wa_we_wi_we • u_u_i_o_a_o_i • u_wi_wo_wa_wo_wi • we_we_u_wo_we_wo_wo • wo_u_n_wa_n_u • ko_ku_n_ka_n_ku • kye_kye_kyu_kyo_kye_kyo_kyo • ki_ki_ku_ka_ke_ki_ke • ki_ki_kyu_kya_kye_ki_kye • kya_kya_ki_kya_kyu_kye_kya • kyu_kyu_ki_kyo_kya_kyo_ki • kyo_kyu_n_kya_n_kyu • so_su_n_sa_n_su • be_be_bu_bo_be_bo_bo • ka_ka_ki_ka_ku_ke_ka • gye_gye_gyu_gyo_gye_gyo_gyo • gi_gi_gyu_gya_gye_gi_gye • gi_gi_gu_ga_ge_gi_ge • gya_gya_gi_gya_gyu_gye_gya • gyu_gyu_gi_gyo_gya_gyo_gi • gyo_gyu_n_gya_n_gyu • gu_gu_gi_go_ga_go_gi • ge_ge_gu_go_ge_go_go • she_she_shu_sho_she_sho_sho • shi_shi_su_sa_se_shi_se • sha_sha_si_sha_shu_she_sha • shu_shu_si_sho_sha_sho_si • sho_shu_n_sha_n_shu • je_je_ju_jo_je_jo_jo • ji_ji_zu_za_ze_ji_ze • ja_ja_zi_ja_ju_je_ja • ju_ju_zi_jo_ja_jo_zi • jo_ju_n_ja_n_ju • si_si_shu_sha_she_si_she • su_su_shi_so_sa_so_shi • zo_zu_n_za_n_zu • ta_ta_chi_ta_tsu_te_ta • da_da_ji_da_zu_de_da • se_se_su_so_se_so_so • go_gu_n_ga_n_gu • sa_sa_shi_sa_su_se_sa • dye_dye_dyu_dyo_dye_dyo_dyo • ti_ti_chu_cha_che_ti_che • di_di_dyu_dya_dye_di_dye • tsi_tsi_tu_tsa_tse_tsi_tse • zi_zi_du_za_ze_zi_ze • tse_tse_tu_tso_tse_tso_tso • ze_ze_du_zo_ze_zo_zo • tso_tu_n_tsa_n_tu • zo_du_n_za_n_du • tsu_tsu_chi_to_ta_to_chi • ya_ya_i_ya_yu_ye_ya • yu_yu_i_yo_ya_yo_i • ga_ga_gi_ga_gu_ge_ga • ji_ji_zu_da_de_ji_de • dya_dya_di_dya_dyu_dye_dya • dyu_dyu_di_dyo_dya_dyo_di • dyo_dyu_n_dya_n_dyu • te_te_tsu_to_te_to_to • zu_zu_ji_do_da_do_ji • de_de_zu_do_de_do_do • mo_mu_n_ma_n_mu • tsa_tsa_tsi_tsa_tu_tse_tsa • za_za_zi_za_du_ze_za • tu_tu_tsi_tso_tsa_tso_tsi • to_tsu_n_ta_n_tsu • yo_yu_n_ya_n_yu •
du_du_zi_zo_za_zo_zi • do_zu_n_da_n_zu • ra_ra_ri_ra_ru_re_ra • hye_hye_hyu_hyo_hye_hyo_hyo • hya_hya_hi_hya_hyu_hye_hya • hyu_hyu_hi_hyo_hya_hyo_hi • hyo_hyu_n_hya_n_hyu • hi_hi_hyu_hya_hye_hi_hye • hi_hi_fu_ha_he_hi_he • nye_nye_nyu_nyo_nye_nyo_nyo • rye_rye_ryu_ryo_rye_ryo_ryo • nya_nya_ni_nya_nyu_nye_nya • rya_rya_ri_rya_ryu_rye_rya • ryu_ryu_ri_ryo_rya_ryo_ri • nyu_nyu_ni_nyo_nya_nyo_ni • nyo_nyu_n_nya_n_nyu • ryo_ryu_n_rya_n_ryu • ni_ni_nyu_nya_nye_ni_nye • ri_ri_ryu_rya_rye_ri_rye • ri_ri_ru_ra_re_ri_re • ru_ru_ri_ro_ra_ro_ri • ni_ni_nu_na_ne_ni_ne • na_na_ni_na_nu_ne_na • re_re_ru_ro_re_ro_ro • nu_nu_ni_no_na_no_ni • ne_ne_nu_no_ne_no_no • ro_ru_n_ra_n_ru • no_nu_n_na_n_nu • wa_wa_wi_wa_u_we_wa • ha_ha_hi_ha_fu_he_ha • ba_ba_bi_ba_bu_be_ba • pa_pa_pi_pa_pu_pe_pa • n_zi_n_je_n_jo • n_chi_n_te_n_to • n_n_i_n_e_n_o_n • n_i_n_ye_n_yo • n_wi_n_we_n_wo • bye_bye_byu_byo_bye_byo_byo • n_ki_n_ke_n_ko • n_ki_n_kye_n_kyo • n_gi_n_gye_n_gyo • n_gi_n_ge_n_go • n_shi_n_se_n_so • n_ji_n_ze_n_zo • n_si_n_she_n_sho • n_ti_n_che_n_cho • n_di_n_dye_n_dyo • n_tsi_n_tse_n_tso • n_ji_n_de_n_do • bya_bya_bi_bya_byu_bye_bya • byu_byu_bi_byo_bya_byo_bi • byo_byu_n_bya_n_byu • n_hi_n_hye_n_hyo • n_hi_n_he_n_ho • n_ni_n_nye_n_nyo • n_ri_n_rye_n_ryo • n_ri_n_re_n_ro • n_ni_n_ne_n_no • n_bi_n_be_n_bo • bi_bi_byu_bya_bye_bi_bye • n_bi_n_bye_n_byo • bi_bi_bu_ba_be_bi_be • n_fi_n_fe_n_fo • n_pi_n_pye_n_pyo • n_pi_n_pe_n_po • n_mi_n_me_n_mo • n_mi_n_mye_n_myo • n_vi_n_ve_n_vo • fye_fye_fyu_fyo_fye_fyo_fyo • fi_fi_fyu_fya_fye_fi_fye • fi_fi_fu_fa_fe_fi_fe • pye_pye_pyu_pyo_pye_pyo_pyo • fe_fe_fu_fo_fe_fo_fo • fo_fu_n_fa_n_fu • fyo_fyu_n_fya_n_fyu • pyo_pyu_n_pya_n_pyu • fya_fya_fi_fya_fyu_fye_fya • fyu_fyu_fi_fyo_fya_fyo_fi • pya_pya_pi_pya_pyu_pye_pya • pyu_pyu_pi_pyo_pya_pyo_pi • ze_ze_zu_zo_ze_zo_zo • fu_fu_hi_ho_ha_ho_hi • bu_bu_bi_bo_ba_bo_bi • fu_fu_fi_fo_fa_fo_fi • pi_pi_pyu_pya_pye_pi_pye • pi_pi_pu_pa_pe_pi_pe • pu_pu_pi_po_pa_po_pi • he_he_fu_ho_he_ho_ho •
pe_pe_pu_po_pe_po_po • fa_fa_fi_fa_fu_fe_fa • me_me_mu_mo_me_mo_mo • mu_mu_mi_mo_ma_mo_mi • bo_bu_n_ba_n_bu • ho_fu_n_ha_n_fu • po_pu_n_pa_n_pu • ma_ma_mi_ma_mu_me_ma • mye_mye_myu_myo_mye_myo_myo • myo_myu_n_mya_n_myu • mya_mya_mi_mya_myu_mye_mya • myu_myu_mi_myo_mya_myo_mi • mi_mi_mu_ma_me_mi_me • mi_mi_myu_mya_mye_mi_mye • vi_vi_vu_va_ve_vi_ve • ve_ve_vu_vo_ve_vo_vo • vo_vu_n_va_n_vu • va_va_vi_va_vu_ve_va • vu_vu_vi_vo_va_vo_vi • wo_wo_a_wo_i_wo • u_wo_e_wo_n_wo

In terms of how to alias these, I would suggest looking to Namine Ritsu's or Utaune Nami's voice bank for reference.

The next steps

All right, so you have the reclist for your voice bank and are ready to record!! This is where I suggest having a very good microphone. Condenser microphones are recommended for more professional work as they pick up more detail, but a good USB microphone will also suffice. I used a simple plug-in microphone to record my UTAU, but I plan to update her when I get a new microphone.

Be sure to record in a quiet environment to avoid any kind of background noise leaking into your sounds. Background noise makes your UTAU's singing sound very robotic, and it is not pleasant to hear.

Now, as I mentioned before, it's always good to reference other voice banks so you know you are along the right lines. When you record in Audacity, be sure to trim any silence off the start and end if needed. To make your UTAU sound a tad more realistic, try fading the beginning and end of each note in and out, because it makes it smoother. Always export the audio as wav, as this is what the UTAU software reads!

Now when you save, you can name the files in either romaji (ka, ki, ko etc.) or hiragana (か, き, こ etc.). I recommend saving in romaji if you do not know Japanese, as it makes it easier to track which sounds you have recorded. I personally prefer romaji because I can barely read hiragana at the best of times.
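Since keeping track of which sounds you have recorded gets fiddly with a long reclist, here is a rough Python sketch of how you could check it automatically. It assumes each recording is saved as a .wav file named exactly after its reclist entry (for example ka.wav); the function names and the sample data are made up for the illustration.

```python
from pathlib import Path


def recorded_names(folder):
    """Collect the names (without .wav) of the recordings in a folder."""
    return {path.stem for path in Path(folder).glob("*.wav")}


def missing_sounds(reclist, recorded):
    """Return the reclist entries with no recording yet, in reclist order."""
    return [name for name in reclist if name not in recorded]


# Made-up sample data; in practice you would pass
# recorded_names("path/to/your/voicebank") as the second argument.
reclist = ["a", "i", "u", "e", "o", "ka", "ki"]
recorded = {"a", "i", "u", "ka"}
print(missing_sounds(reclist, recorded))
# ['e', 'o', 'ki']
```

The same check works whether the files are named in romaji or hiragana, as long as the reclist uses the same spelling as the file names.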
I have, however, switched to hiragana file names as I work on Minuet's ACT 2. Make sure you spend time on the recordings, and be sure to check out the tutorials I'll link to later on for some really good tips on using UTAU!

So wonderful!! You now have all the sounds for your UTAU completed! I bet you're thinking you're done now, huh? WRONG! Now comes the longest and probably most tedious part if you have no idea what you're doing. This part is called the OTO.ini.

OTO.ini

The oto.ini can be VERY confusing at first – especially if you're totally new to UTAU. The oto.ini is there to melt the notes together and add life to the UTAU, as well as to do some final editing for the wavs. There are lots of other great oto.ini tutorials out there, and after the tutorial I'll add links to those I think could help.

To open the UTAU oto.ini interface, go to the menu, click "Tools" and then "Voice Bank Settings...". This will open up this window:

As you can see, I've selected the sample "a" to work with. The selected sounds are marked in blue, and the numbers you see on the screen all describe how the oto is built. To open the "working interface", click "Launch Editor". Done!

This is the UTAU oto.ini interface. There are two windows; the second one opens when you push the large button placed near the four smaller ones (it's the big one to the left, see?). The second window is where we edit the oto.ini. The first window is for information on the oto, duplicating files and adding hiragana/romaji aliases.

As you can see, there's a lot of stuff going on here that I'll try to explain.

BLUE - the blue part indicates the part we want to delete from the file, for example if you recorded "zu" but forgot to erase the empty space at the beginning/end of the recording. That's where the blue goes. It's very handy.
PINK - this part makes sure that your samples won't go "nnnnaaa" instead of "naaa". The pink goes over the consonant and partly over the vowel.
RED LINE - overlap: how much of the note/sound shall overlap the previous one.
GREEN LINE - this part is the "preutter", i.e. when the sound will actually start. For example, I want my UTAU to say "sama". I add the "sa" and "ma", and the preutter turns it into "sa..ma" with a short pause for me. It's very convenient for realistic sounds!

NOTE!! NEVER drag the red and green lines TOO FAR or they'll sound OFF and BAD. Experiment before releasing a voice bank with a newly configured oto.ini!!

STEP BY STEP

Here's the opened sound we want to oto. As you can see, there's nothing more than the sound file at the moment.

We drag the pink over the consonant plus some of the vowel.

Then we add the overlap and preutter like this, a bit after the consonant but not too far into the vowel. Experimenting is the best way to work an oto.ini. The green should NEVER be too far into the sound or it will sound horrible; it's better to leave the green somewhere in the beginning, on the consonant.

Perfect! Now we add the blue over the parts we don't want, like stuff we should've deleted in Audacity but didn't for some reason, or just sounds we don't want. Last but not least, we drag the blue over the very end of the sound. DONE!

Oto'd voice bank vs. no-oto voice bank: the first pic is of a voice bank with an oto.ini singing "arigato", the second of a voice bank with no oto.ini. See the differences? The oto.ini makes it go "a..riGaa..too", and the no-oto one goes "arigato". The first one sounds a lot more realistic and nice. Every voice bank should have a working oto before it's released.

(Oto information and pics taken from http://purutau.blogspot.co.uk/2010/12/utau-tutorial-otoini-configuring-cv.html)

If you have any UTAU friends, ask them to test the voice bank for you to see if it sounds ok. Make sure you do lots of testing before you release your UTAU!

Last steps

So now that your voice bank is complete, it's time to do some of the misc work before you start using USTs.
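Before moving on, here is a quick recap of the oto.ini section in code form. The editor saves its settings as one text line per sound, commonly laid out as filename=alias,offset,consonant,cutoff,preutterance,overlap, with the values in milliseconds. The Python sketch below reads one such line and applies the rule of thumb from above that the preutter should stay on the consonant. The sample numbers, the class name and the function names are all made up for the illustration; this is not part of UTAU.

```python
from typing import NamedTuple


class OtoEntry(NamedTuple):
    filename: str
    alias: str
    offset: float        # unwanted part trimmed from the start
    consonant: float     # length of the consonant (fixed) region
    cutoff: float        # unwanted part trimmed from the end
    preutterance: float  # where the sound "actually starts"
    overlap: float       # how much it overlaps the previous note


def parse_oto_line(line):
    """Split one oto.ini line into its file name, alias and five numbers."""
    filename, _, params = line.strip().partition("=")
    fields = params.split(",")
    if len(fields) != 6:
        raise ValueError(f"expected alias plus 5 values: {line!r}")
    numbers = [float(value) if value else 0.0 for value in fields[1:]]
    return OtoEntry(filename, fields[0], *numbers)


def check_entry(entry):
    """Flag a preutterance that sits beyond the consonant region."""
    warnings = []
    if entry.preutterance > entry.consonant:
        warnings.append(f"{entry.filename}: preutterance is past the consonant")
    return warnings


entry = parse_oto_line("ka.wav=か,50,120,-300,80,30")  # illustrative values
print(entry.alias, entry.preutterance, entry.overlap)
print(check_entry(entry))
```

A check like this only catches obviously odd values. Listening to the voice bank is still the real test.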
First of all we need to make a separate txt file that will hold the character information! So go to Notepad (if you're using Windows) or TextEdit (if you're using a Mac) and type the following:

• name = (enter the name of your UTAU; can be romaji or hiragana)
• image = (the image must be 100x100 and saved as a bmp; you put the file name here. E.g. "Profile_image.bmp")
• author = (your name here!)
• web = (your website here; this can be deviantArt, tumblr etc.)
• sample = (if you have a sample of your UTAU then you put that here. Same as for the profile image, you just put the file name here. E.g. "sample.wav")

Once you've done that, be sure to save it as character.txt. If you use any kind of hiragana in this, be sure to switch your settings to Unicode so that it reads ok in UTAU.

The next txt file to create will contain information about your UTAU. It can be plain and simple or it can be detailed like mine, so I will provide the layout that I used. Create your file and input this:

• Terms and conditions – This is just some simple information, such as: Is it CV or VCV? Who created it? Is there more information available, and if so, what is the link to the page? Do you allow edits? Copying? Be sure to list everything here if you want to be specific about your rights.
• Make a solid line to separate the info, since underneath this you want to put character traits.
• Name: (Your UTAU's name)
• Age: (Your UTAU's age)
• Height: x’x”ft (xxxcm) (this is to cover the metric system as well)
• Weight: xxxlbs (x’x”stone)
• Gender: (This can be anything obviously, it's not restricted to male or female)
• Voice Range: (You don't have to enter this, but it can help users if your UTAU works better at a certain range. You can find this information out in UTAU when you first run it.)
• Flags: (Do you use flags? Be sure to list them here. Don't worry; a list of available flags is given later.)
• Resampler: (Does your UTAU sound better with different resamplers? List them here.)
• Character Item: (Every UTAU and Vocaloid has an item! Be sure it matches your UTAU's personality and/or design.)
• Personality: (List your UTAU's personality here.)
• Likes: (What does your UTAU like?)
• Dislikes: (What does your UTAU hate?)
• Related characters: (Only list those you have either created or have gained permission for.)

Be sure to then save it as readme.txt so that UTAU will read it, and the results are as follows!

With that out of the way, your UTAU's voice bank is now completed! Celebrate and rejoice, as there is no more work required here! Just remember these steps whenever you make a new voice bank, and if you plan on making appends, ALWAYS DO RESEARCH! It helps in the long run.

UTAU Flags

Flags are probably among the most difficult parts of UTAU to understand, as they can either help a voice bank or destroy it. I found the flag I use for Minuet somewhere online; it helps her sound a bit clearer, since I didn't have a good microphone, but I suggest experimenting a bit with them. The most commonly used flag is the gender factor. I usually use g-2 or g+2 for Minuet, depending on the pitch of the UST. I will explain the flags with a useful chart I found in the User Guide for UTAU, which I will link in the Useful Links section.

In UTAU, you can perform a variety of tonal marks by entering various flags like "g-5H30Y0" in the "Flags" text box of "Sound Properties" 「音のプロパティ」.

Note: You must enter half-width uppercase or lowercase letters in Flags. Also please note that the case is significant (uppercase and lowercase are different types of Flags).

In addition, there are flags that are valid with all the resamplers (singing synthesis engines), and flags that are valid only for the latest generic Resampler version and for the development versions (resampler7, resampler8).
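To show how a flag string like "g-5H30Y0" breaks down into separate flags, here is a small Python sketch. It only illustrates the notation described above (one half-width letter, optionally followed by a signed number, with upper and lower case kept apart). The function name and the pattern are my own, and this is not how any resampler actually reads its flags.

```python
import re

# One half-width letter, optionally followed by a signed whole number.
FLAG_PATTERN = re.compile(r"([A-Za-z])([+-]?\d+)?")


def parse_flags(text):
    """Split a flag string such as 'g-5H30Y0' into (letter, value) pairs."""
    flags = []
    position = 0
    while position < len(text):
        match = FLAG_PATTERN.match(text, position)
        if match is None:
            raise ValueError(f"unexpected character at {position}: {text!r}")
        letter, value = match.groups()
        # Case is kept as typed, because 'H' and 'h' are different flags.
        flags.append((letter, int(value) if value is not None else None))
        position = match.end()
    return flags


print(parse_flags("g-5H30Y0"))
# [('g', -5), ('H', 30), ('Y', 0)]
```

A full-width letter, or anything else that does not fit the pattern, raises an error here, which mirrors the note about entering half-width letters only.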
All the valid resampler flags:

(Each entry gives the flag, its base value and its setting range, followed by the feature and how to set it up effectively.)

Flag: g | Base value: 0 | Setting range: -100 .. +100
A flag to control the effect offered by the formant (the voice quality determined by the structure of the throat or mouth). (Strictly speaking, this is different from VOCALOID2's gender factor, but it has the same effect.) Make sure to set this flag value with the + or - symbols appended, like e.g. g+10, g-10. When setting positive values, the voice becomes deeper, more mature and masculine (with +20 or more, a female voice can become male). When setting negative values, the voice becomes thinner, more childlike and feminine (with -20 or less, a male voice can become female).

Flag: t | Base value: 0 | Setting range: -9 .. +9
Flag to adjust the pitch in units of 10 cents (1/10th of a semitone). Make sure to set values with the + or - symbols appended, like e.g. t+5, t-5.

Flag: Y | Base value: 100 | Setting range: 0 .. 100
The part outside of the fixed range in consonants is called breathiness. (Details are omitted.) By specifying a small value like e.g. Y0, the breathiness part of consonants becomes relatively stronger, and the articulation is considered to be better. (As a side effect, noise appears that makes high notes sound metallic, so increase the flag value, or simultaneously adjust the H low-pass filter described below.)
Note: If you use a continuous sound source, specifying a small value like e.g. Y0 causes noise, so please leave it at the default value Y100.

Flag: H | Base value: 0 | Setting range: 0 .. 99
A low-pass filter to emphasize the bass and cut the treble. (When used together with the C, D, E low-pass filters described below, they produce the same effect.) It has the effect of mitigating the metallic noise on high notes, but as a side effect the sound becomes muffled.

Flag: h | Base value: 0 | Setting range: 0 .. 99
A low-pass filter operating outside of the breath component of the consonant (breathiness). As it emphasizes the high frequencies of consonant components, it is unsuitable for sound sources where the consonant component is unstable.
Note: If set too strong, voices become hoarse even with sound sources in which the consonant component is stable, and you need to reduce the value of the Y flag.

Flag: F | Base value: 3 | Setting range: 0 .. unspecified
This adjusts the strength of the formant filter. The formant filter depends on the frequency defined by "source frequency * specified value". It is generally better not to touch it, but when noise appears in low tones, you can suppress it by specifying values around F4 .. F7. This flag is valid for Resampler development versions too, but changes are not as big as in the default generic Resampler version.

Flag: L | Base value: None | Setting range: 0 .. unspecified
A fixed frequency for the "F" flag above. The formant filter then depends on the frequency defined by "170Hz * specified value". When used simultaneously with F, this value takes precedence.

Flags that are valid only for the latest generic and the development Resampler versions:

Flag: b | Base value: 0 | Setting range: 0 .. 100
BRE adjustment after the formant filter. BRE changes are loosened when its pitch is very different from the pitch of the primary sound, and the voice becomes unpleasantly rough. In addition, because it is not influenced by the low-pass type filters (C, D, E, H and h), the sound coming from BRE is bad and muffled.

Flag: C | Base value: 0 | Setting range: 0 .. 100
A low-pass filter especially reducing the high frequencies. When set to 100, the volume is 100% at 0kHz, 50% at 11kHz, and 0% at 22kHz.

Flag: D | Base value: 0 | Setting range: 0 .. 100
A low-pass filter cutting the midrange. When set to 100, the volume is 100% at 0kHz, 0% at 11kHz, and 100% at 22kHz.

Flag: E | Base value: 0 | Setting range: 0 .. 100
A low-pass filter cutting the bass and treble. When set to 100, the volume is 100% at 0kHz, 0% at 7.1kHz, 100% at 11kHz, and 0% at 22kHz.

Flag: c | Base value: 50 | Setting range: 0 .. 100
The value of the C flag before the formant filter adaptation.

Flag: P | Base value: 86 | Setting range: 0 .. 100
Peak compressor. Aligns the peak volume of the primary sound. (The volume setting and the envelope changes are applied separately.) There is zero variation with a value of 100. When set to 99 or less, the variation produced is proportional to the volume of the primary sound and to the parameter value. Because only the peak volume of the primary sound is aligned, a sense of instability in the volume will remain in sound sources with unstable volume changes, even if it is set to 100.

Flag: W | Base value: unspecified | Setting range: unspecified
Produces a robotic voice. This flag is highly experimental, and is generally not used.

(Source - http://utau.wikia.com/wiki/UTAU_User_Manual_-_7 )

As I stated before, experiment with flags, but be careful with the values. Plus, flags don't make a UTAU. They are there to help, but it is a good set of recordings and a good OTO that make a good voice bank.

Using USTs

With your brand new voice bank we can finally make them sing! Yep, this is the fun part, but it can also be frustrating. Remember, if you use someone's UST, ALWAYS CREDIT THEM!! They worked hard on it and need to be recognized. Never plagiarize; you will be found out immediately.

So use Google, YouTube or nico douga to find USTs. For this I will use the UST for Palette as done by HaruVampire on YouTube. When you open it, make sure you load the UST with your preferred voice bank, as UTAU will automatically set Defoko as the voice or try to use the voice bank the UST was originally created with.

As you can see, HaruVampire's UTAU comes up, so you want to be sure to change it. If you have any preferred flags for your UTAU, be sure to set them here. Don't be afraid to experiment. Click OK and load it up. Your window should look like this!

Wonderful, now onto the next steps. You now want to make sure the UST fits your UTAU, as every UTAU has different settings that might not mix well with the current settings on a UST. ALWAYS check a UST in case it has settings already, or it'll cause problems later on. If you find some of the fields are filled in, then do the following:

1. Select ALL of the UST by hitting ctrl+a. This will allow you to edit the track. Then right click on it.
2.
This drop down menu should appear. You want to go to the very bottom where it says Region Property and click on that. It should bring up a new pop up window.
3. You don’t want to touch Intensity or Modulation. NEVER touch these unless you are making a UST, because you can really mess up the balance for the song. If Preutterance and/or Overlap is greyed out, this means it won’t fit your UTAU. You want to hit the clear button for that.
4. There’s a little link that says “Details” on it. Click that to drop down the rest of this window.
5. You’ll see that it has some more to it: BRE, Flags and STP. BRE adds breathiness and can make your UTAU sound robotic, so leave it only if that is what you are aiming for. I suggest clicking on the box and hitting space to clear it.
6. Always clear flags since you don’t know what flags the creator has used and they might make your UTAU sound distorted. You should have set your own flags if you read my previous steps.
7. ALWAYS CLEAR STP! This feature subtracts milliseconds from the head of the preutterance. In other words, it’ll make it sound like it’s not pronouncing correctly if there’s too much STP. When you’re done setting everything up, be sure to hit ok to save changes.
8. When you’ve finished up, you then want to hit these three buttons at the top of UTAU in this particular order:
a. ACPT (Apply automatic parameter adjustment)
b. P2P3 (Set crossfade envelopes by p2 and p3)
c. P1P4 (Set crossfade envelopes by p1 and p4)
d. ACPT (Apply automatic parameter adjustment)
9. That will make the UST fit nicely to your UTAU!
10. Now for the final steps that I personally do, but you don’t have to if you don’t want to. Once again, make sure you select everything with ctrl+a.
11. Click on the Tools menu and navigate down to Built-in Tools and click on A LA CARTE. Now this bit can be confusing to a beginner but trust me, it will help in the long run. I was kindly told about this by my friend felipone.
12.
You want to make sure the box next to “Connect vowels smoothly to previous note!” is checked. Click all the hiragana boxes as this will cover the vowels, then add into the “Others” box the hiragana for “wo” – を and then in English “a e i o u n wo”. Now the settings can be anything you want but try to make it work for your UST. Leave Rising Note and Falling Note unchecked. For me, I set the timing to Medium and Rising+Falling Note Change to Medium. Now I encountered a massive problem with HaruVampire’s UST for Palette, which was that there was too much vibrato. I fixed this with A LA CARTE by checking the box next to “Add Vibrato!” and setting it to Little. Set Frequency to Medium and the duration to Medium. Once you have all your settings, be sure to click all so it’ll apply to the whole UST. You can also use it on just selected areas!
13. And for the final step, navigate to the Tools menu once more, go to Built-in Tools and click on Crossfade. Make sure only the Crossfade box is checked and put into the target the hiragana for the vowels (あ, え, い, お, う), n (ん) and wo (を). Then after that type in English the vowels, n and wo. This will make everything flow into each other nicely and will help the UTAU pronounce better. Set the Width to 100msec and set Start to -70msec. These are my settings all the time and are the last step I do for my UST editing.
14. Now sometimes you might get an exclamation box over some notes even after all the editing. If this happens, right click on the note with this box. Navigate to Envelope on the drop down menu to bring up a new pop up box. Now you want to make sure the four red boxes aren’t overlapping each other or doing anything funky, so just click on the Normal button and hit OK and that should fix it. Be sure to go through the whole UST to fix problems like these or your UTAU will sing way off key.
15. There you go! All finished! Now all you have to do is render this out!
Go to the Project menu and click on the Render wav File option. Pick your save location, name it as necessary and save. Sorted! Now you can close UTAU and we can finally go onto Mixing.

Mixing

All right, this is probably the most fun and also most FRUSTRATING part of making a song!! You want to have some mixing software and there’s a wide range available! There’s Audacity, Reaper, MAGIX Music Maker and so on. Choose one you think you can become familiar with. For me it’s Audacity and, as previously stated, this is due to using it A LOT at university. I can navigate it fairly well and I find I can make a lot of nice effects with it thanks to the number of wonderful tutorials there are on YouTube.

So first things first, make sure you have an off vocal for the song you are making. These can be found around YouTube and are normally supplied with the UST. I had to find mine for Palette as it wasn’t supplied in HaruVampire’s UST. A handy thing to have is the actual song, and you’ll see why as we progress. So how do we mix? Ok, here are the steps:

1. Import all your audio into your mixing software. For this tutorial I’ll be using Audacity. I normally just click and drag what I need into it. Click ok to any pop up windows. Don’t change anything; you want it to remain intact.
2. Mute your audio (including any harmonies) and then import your Off Vocal.
3. Be sure to mute the Off Vocal when it’s in and, if you have the actual song, import that. Why? This is going to help you with your timing!
4. Now that everything is in, let’s make sure we get everything on time. You want to click this little button and start moving the Off Vocal to match up to the timing of the actual song. Try your best to match the waveforms and you should have no problems.
5. Next is to line up your audio. Now in the case of this UST, I had to find where it started singing in the actual song and then moved the audio until it was in time to the song.
A note to make is that not all USTs need this done, as some creators are nice enough to time it for you. Neemiso is one of the best at doing this. Do the same with your harmonies so they are in alignment with your audio!
6. Ok, with that done and everything timed perfectly, you can remove the actual song from Audacity by clicking on the “x”, and be sure to unmute everything so you can hear it. Now to get it into that nice song you have dreamed of making. You will need to select the Off Vocal, go to the Effect menu and pick Amplify. Set Amplification to -1.5 and click OK if highlighted. If not, click “Allow Clipping” then OK. This will lower the volume on the Off Vocal so it’s not too overpowering. Do this however many times you need to, but always be sure to balance the Off Vocal with the Vocals.
7. Now select your Vocals! This includes harmonies! This is where we make sure your singer is optimized. Click on Effect and select Reverse. This will reverse your Vocals, but trust me, this is an essential step.
8. Next click on Effect and click on Compressor. Don’t change any of the settings, just hit OK.
9. Go to Effect again and this time select Equalization. Again, don’t change any of the settings, just hit OK.
10. Once more, go to Effect and select Normalize. Don’t change the settings and hit OK.
11. Now we just Reverse it back, so click Effect and go to Reverse. There we go, your Vocals are now nice and clear!
12. A lot of songs require you to use Effects on them, such as to make a radio sounding voice and so on. I can’t help here, so I advise you to look up tutorials on the kind of effect you are after, but I will tell you the effect I always use on my Vocals.
13. With the Vocals still selected, go to Effect and select Echo. Play with the settings as you like, but I tend to set my Delay Time to around 0.1 – 0.2 and the Decay Factor between 0.2 – 0.3. If I use more, it’s usually to make the echo more present in the song if it calls for it.
14.
This is the most time consuming part of making a song as we need to make sure the Vocals and Off Vocal are in harmony. You can use these little bars to change the sound a bit to make them sound good to you. If I for example set my Vocals to -2dB, I will set my Harmonies to around -4 – -7dB. This means they’re not overpowering. I have a cut off for my Vocals as well. If they exceed this line, then I use Amplify set to -1.5 to start lowering the volume and I’ll do this however many times I need to until I am personally satisfied. Same goes for the Off Vocals. I usually set my Off Vocals to around 2dB but some songs might need less.
15. Once you are done with all the mixing it is finally time to export it. Now the reason I said before to download Lame is so you can export in MP3. MP3 files are much smaller and are more commonly used for songs nowadays. So go to File, select Export Audio.
16. Save it to your preferred location and be sure to name it as the finished song. So an example would be I would save mine as “Palette_feat_Minuet_Aoi” so that I knew it was done.
17. Once you hit save, make sure to hit ok for the next window and then you have the Edit Metadata screen. Here is where you add information about the song if you want. I always do since I’m a bit OCD that way, so this is how mine would look. Once you’re happy hit OK and it’ll begin to export to your save location.
18. And that’s it for Mixing. Always save your projects on the off chance something happens or if you want to edit it. Trust me, it can help if you decide to change anything in the vocals.

As you can see, it becomes easier the more you do this because you learn with each new UST you work on. ALWAYS CREDIT THE CREATOR OF THE UST!!! I can’t stress this enough but please, credit them and link to the original download or to the actual creator’s page.

So you are now an UTAU user, I bet you think you’re done now? Nope!
There’s one more thing to do and that is to create a wiki page for your UTAU on the UTAU wikidot. This community is better than the previous UTAU wiki and you can create a page easily enough!

A useful guide on mixing that I found on a UTAU forum! It explains everything a beginner needs to know so you guys can follow this one too!

Audacity and LAME:
Audacity is open-source, has an excellent wiki, and solid functionality. We'll be using Audacity 1.3.12 Beta for this guide. You'll also need LAME, in order to save mp3 on Windows. Follow these instructions.

Audacity Recording and Editing Basics:
The easy stuff:
• Recording via the recording button (toolbar)
• Importing your BGM (just drag and drop is fine)
• Adjusting microphone sensitivity with the slider with the microphone icon (toolbar)
• Zooming in and out in time (a.k.a. the x axis) with the +/- magnifying glasses (toolbar)
• Zooming in and out in the y axis by clicking and shift+clicking on a track's y axis labels
• Selecting a piece of audio with your mouse, cutting if necessary (ctrl+x)
• Selecting the whole track by clicking on the space where it says "Stereo, 44100 Hz" etc.
• Changing the graphical display of a track to Waveform, Waveform (dB), etc. (that triangle button beside the name of the track)
• Making the track display area bigger by dragging the bottom edge of the track. This will make the y axis labels display more information, which will be useful for mixing.
• Adjusting the volume (or gain) of a track (slider on the left side of the track with -/+ signs)

Before Mixing - Timing:
Nothing says "I didn't practice enough" more than starting to sing half a second after you were supposed to, or finishing a phrase with a syllable or two still unsung. Sure, you can blame it on the music starting too suddenly or something, but you don't see better singers doing it =P. It's possible to edit timing, but it's much easier to fix it by practicing your singing.
• Recognize: Play your vocals and the original song at the same time (import them both into Audacity). Make sure they start at the exact same time by zooming in in time and cutting out bits of silence at the beginning. It'll be obvious during playback which parts you messed up the timing on.
• Recognize: Audacity sometimes has an infuriating habit of adding a bit of delay in front of your recording as it prepares itself to record. This can range from 40 to 400 milliseconds. Even 40 milliseconds of delay is noticeable, so it's up to you to make this right! Zooming in in time and looking at the waveform helps.
• Prevent: Nothing you can do about the Audacity lag, other than getting a better computer =P. As for your own singing, suck it up and practice! =D

Before Mixing - Pitch:
You're off-tune? And you hope mixing and editing can save you? You're right, but it's hard. Very hard. Harder than five-year-old cheese. Audio engineers do it for pop idols that can't sing any better than your favourite nico singer (actually most nico singers are probably much better than the likes of Hannah Montana). But you're neither an audio engineer nor a pop idol (yet), so you'll have to make do with good ol' fashioned practice...
• Recognize: If you've got a good ear, you'll hear it. If you don't have a good ear, someone else will hear it, so ask. How do you know if you have a good ear? This test can tell you.
• Prevent: Practice practice practice... it's hard, I know. Pay attention to the pitch you're producing, try singing a bit more slowly, watch out when you go high or low. Whatever you notice you're weak in, practice it.

Before Mixing - Clipping:
All microphones have a certain level of maximum sound energy they can convert to electrical energy to send to your computer. Any difference between the energy received and the energy sent on is simply lost. As a result, recordings of such strong vocals sound like they're missing bits of signal, as if they've been "clipped" out.
Clipping is best fixed by properly setting the sensitivity on your mic, and not by mixing or editing.
• Recognize Clipping: Change the graphical display of your track to "Waveform". Do any of those waves touch the ceiling or floor? If so, you might have a little bit of clipping. If you find that the waves are hugging the ceiling or floor for seconds at a time, you've clipped, man, and you've clipped BADLY.
• Prevent Clipping: Turn down the sensitivity of your mic (that slider in the toolbar, with the microphone icon next to it) and rerecord until your vocals no longer have waveforms that touch the ceiling/floor, especially during the loud parts. If you think this makes the soft parts too soft, I know already you're going to like the compression section of this guide =D.
• Prevent Clipping: Also, make sure you're not so close to the mic you're about to devour it. If you have a regular mic, put it to the side of your mouth, instead of directly in front, to avoid "boom" sounds caused by breathing into the mic. If you have a condenser mic, consider a pop filter. You know, one of these.
• Okay, so maybe you can fix clipping a little bit: Krystal doesn't like it, but *whispers* Effect -> Clip Fix... Try it out if you only have a little clipping. But be warned, it takes a REALLY long time.

Mixing - Noise Removal:
There's always a bit of background noise. No, I'm not talking about your brother's yelling downstairs. I'm talking about the hum of your computer and the ambient buzz in the air. Your brain might tune it out for you, but the microphone will not. Unfortunately it's hard for the computer to distinguish noise from voice, so with any noise removal process there comes a little distortion in vocals. The skill in mixing here comes from the right balance between noise removal and voice preservation.
• Select a few seconds of the noise you want to remove, and go to Effect -> Noise Removal. Click "Get Noise Profile". This tells Audacity what noise to remove.
• Now select a portion of your vocals and go to Effect -> Noise Removal again.
o Noise reduction (dB): How much to reduce the noise by. More reduction means less noise, but also more voice distortion.
o Other settings: Don't worry about them until you're pro enough. (Actually, I only know what they do, but not how to use them effectively. The default settings work fine though.)
• Use the preview button. You'll notice it only gives you the first few seconds of whatever piece of audio you selected. This'll help you adjust the settings until you're satisfied with the effect.
• Once you're satisfied, remember the settings and click cancel. Now select the whole track and Effect -> Noise Removal again. Enter the settings you decided on and click OK.

Mixing - Compression:
Dynamic range refers to the difference between the volume of the loudest sound and the softest sound. Raw vocals have a HUGE dynamic range, much larger than your BGM, in most cases. That's why oftentimes if your verse is just right your chorus gets too loud, or if your chorus is just right you can't hear the verses anymore. Compression "compresses" the dynamic range so the two are closer together in terms of volume, thus blending in with the BGM which has a similar dynamic range. The mixing skill here is to reduce differences in volume, but not so much that you can't hear the differences between powerful vocals and "sweet" vocals.
• Change the graphical display of your vocal track to "Waveform (dB)" (remember that little triangle thing next to the track name?) and drag the bottom edge of the track until you have lots of informative labels displayed on the y axis. Don't be afraid to make the track so large as to fill the screen.
• You'll notice general differences in volume between various parts of your vocals. Record approximately how loud (Audacity records the loudest as 0 dB and the softest as -60 dB) your soft parts and loud parts are. Say you found that they were -25 and -10 dB respectively.
• Now, select a portion of your recording you'd like to preview, preferably containing a second of soft vocals followed by a second of loud. For example, the transition to a chorus.
• Effect -> Compressor
o Threshold: How loud the vocal has to be before compression is applied to it. We want to compress the loud vocals while leaving the soft vocals as they are. For our values of -25 dB soft and -10 dB loud, we'll set the threshold to -20 dB. Thus anything louder than -20 dB (such as our loud -10 dB vocals) will be compressed.
o Ratio: How much the vocals to be compressed will be compressed. 2:1 means that the dynamic range of whatever passes the threshold will be cut in half.
o Other settings: They're fine as they are.
• Preview, and play with the ratio until the volumes are more equivalent between the soft and loud vocals, but not so much so that you can't hear the difference in power anymore.
• Remember the settings, cancel, select whole track, and apply the compressor with the settings you decided on.
• There's a curious trend in the music industry to heavily compress the dynamic range so as to get the loudness of every part of the song as high as possible. This makes sense, since it'll be easier to hear high and low frequencies when it's louder. And if two identical songs, one slightly louder than the other, were to be played, the louder one is generally regarded as better. Many people think heavy compression isn't good (I'm one of those people, since I like classical, and play piano. Dynamics are very important... Compressing until only 3dB remain, like TV commercials are, is unthinkable to me.), but that's the way it is right now. Wiki up "loudness war" if you're interested.

Mixing - Equalization:
Every pitch is identified via the frequency of the sound waves carrying the pitch. The higher the frequency, the higher the pitch you hear.
The human voice typically ranges from 80 to 1100 Hz, with low vocals obviously at the lower end and high vocals at the higher end. Men typically have vocals centered between 80-500 Hz (not counting falsetto), and women between 170-1100 Hz (though the women's range covers more Hz, the relationship between frequency and pitch is not linear. For each octave you go up in pitch, you'd have to DOUBLE the frequency; thus more frequency change is needed to go from high C to high D than from low C to low D).

The purpose of the EQ is to boost or diminish the volume of sounds based on their frequency. For example, boosting 80 - 200 Hz might make your bass guitar sound more prominent. If your vocals are drowned out by the BGM, you can make the BGM quieter. But as Ciel pointed out, you don't need to make the entire BGM quieter - just the frequencies where it interferes with your vocals' frequencies. In effect, you're creating "space" in the BGM for your vocals. But which frequencies? Well, Audacity has a neat tool...
• Select a representative part of your vocals. Say, the first verse, bridge, and chorus together. Now go to Analyze -> Plot Spectrum.
• Whoa, it's a graph. Frequencies on the X axis, volume on the Y axis. So if you see a peak at 400 Hz that means there are a lot of notes at around 400 Hz. From this graph you can get a feel for what range of frequencies you're singing in.
• Don't fuss about being exact. Once you start getting comfortable with the equalizer, you'll know that ranges, not numbers, are what you'll be working with.
• Now, think about what parts of the song are interfering with your vocals. Let's use Campanella as an example. It starts out simple, with hardly any BGM. But in the chorus it builds up and by the third chorus there's drums and cymbals and piano and even rocketships. My vocals rang true in the beginning but were drowned out near the end.
• My strategy was to find out my vocals' frequency range (which we just did with Plot Spectrum), then make the BGM quieter in that range whenever I feel like I'm being drowned out, usually the chorus. I ended up doing the EQ with every chorus, and with harsher settings in the final chorus. But how do you work the EQ?
• Select the portion of BGM you want to apply the EQ to, and go to Effect -> Equalization
o Whoa, it's another graph. Y axis is volume, X axis is frequency. The default line is at 0 (no adjustment) for all frequencies. You can manipulate the line with the mouse. Try it out. For me, I made a small valley of about -5 dB from 150-600 Hz. Your spectrum might be different.
o Alternatively, you can select the "Graphic EQ" radio button and have sliders instead of messing with the line yourself.
o Other settings: If you're itchy, try them out. Just don't do anything permanent. If not, leave them alone lol.
• Preview doesn't do much here, since you need to hear your vocals at the same time. So to experiment here you'll need to apply the EQ, play it back, and if you're dissatisfied, you'll have to undo and do it again. Yeah, it's one of Audacity's weaknesses - but hey, it's free.
• Once you're satisfied, click "save as" and name your curve. Now you can apply the same EQ to other parts of the BGM that drown out your vocals.

Mixing - Amplification:
Every effect affects volume. Noise removal reduces the volume of whatever it recognizes as noise. Compression reduces the differences in volume. Equalization adjusts volume based on frequency. Amplify affects volume much more simply. It's a pure addition/subtraction in volume. The skill in mixing here comes from knowing where your vocals need boosting. Is a specific part too quiet? Is the whole track too loud?
• Select something that needs boosting (the low notes that you lacked power in, and can thus barely hear, maybe?) and go to Effect -> Amplify.
o Amplification (dB): How much louder/quieter you want it to be.
o New Peak Amplitude (dB): Audacity calculates how many dB it will have to amplify your selection to make the peak (loudest part of the selection) whatever dB you entered here, then changes the "Amplification" field to reflect this calculation. End result is normalization (see next section) to whatever dB you entered. By default this is set to 0.0 dB.
o Allow clipping: If you check this, you can boost volume above 0.0 dB, but clipping will result. I recommend not checking it.
• Preview, adjust, and apply. Play back to make sure you haven't made it so much louder/softer that it stands out too much from the rest of the vocals.

Mixing - Normalization:
Ever had an mp3 that was quieter than most others in your collection? If you were to make the song louder so that the max volume in this song is the same as the max volume in another song (typically 0 dB), you'd be normalizing it.
• Normalization is useful, but there's already a way to do it with Amplify, and it's also included in the Compressor, if you remember.

Mixing - Gain:
Every track has a -/+ slider on the left side of its display. If you can't see it, drag the bottom of the track to make the display area bigger. It has the same effect as Amplify, but limited to exactly one track at a time. So why bother?
• It can be adjusted on the fly. Meaning you can adjust it while the music is playing. It also displays how much gain you're applying in dB. This is excellent for finding out exactly how many dB of amplification that quiet part of your vocals should get.
• Once you know how many dB to amplify, you can put the gain slider back to 0 dB, and use Effect -> Amplify instead to make the actual changes.
• Why not use the gain slider to make volume changes? It applies to the whole track whether you like it or not, so if you only want to make one part of the track louder, you're out of luck. Also, unlike Amplify, there is no "allow clipping" checkbox to leave unchecked, so it won't warn you if you're clipping.
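The dB arithmetic behind Amplify, New Peak Amplitude and the gain slider can be sketched in a few lines of Python. This is only an illustration and not Audacity's own code: the function names are made up for this example, and it assumes the usual convention that a level in dB is 20 * log10 of the linear amplitude ratio.

```python
import math


def db_to_ratio(db):
    """Convert a gain in dB to a linear amplitude multiplier."""
    return 10 ** (db / 20)


def ratio_to_db(ratio):
    """Convert a linear amplitude multiplier back to dB."""
    return 20 * math.log10(ratio)


def amplification_for_peak(current_peak_db, new_peak_db=0.0):
    """dB of amplification needed to move the loudest point to new_peak_db.

    This mirrors the idea of the "New Peak Amplitude" field: the
    amplification is simply the target peak minus the current peak.
    """
    return new_peak_db - current_peak_db


# A selection whose loudest point sits at -6 dB needs +6 dB to reach 0 dB.
print(amplification_for_peak(-6.0))

# Amplify set to -1.5 multiplies every sample by roughly 0.84.
print(round(db_to_ratio(-1.5), 2))
```

Because decibels simply add, applying Amplify at -1.5 twice is the same as applying -3 dB once, which is why repeating the effect lowers the track in even steps.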
Mixing - Reverberation:
Singing in the bathroom obviously sounds different from singing in your bedroom, which in turn is different from singing in a concert hall. The reason is echo and reverberation. If you compare your freshly noise-removed, compressed, and equalized vocals with the vocals from some songs, you'll notice that despite all your mixing, your vocals still sound very... naked. Very raw. But that doesn't mean you should make yourself sound like you're in the Globe Theatre. You just have to match the reverb of your BGM, or at least the reverb of the original Miku vocals or whatever. The skill in mixing here comes from being able to add reverb that is pleasing, but not readily noticeable (it'll distract listeners from your beautiful singing, you know?). As in, you'll notice if you compare, but not if you simply listen.
• Select a suitable preview section. Preferably containing a second of a few words followed by a long drawn-out vowel. (Kaku yuugoro ni saaaaaaaaa~)
• Effect -> GVerb (some people use Echo, but GVerb is more flexible and takes less processing time).
o The settings are complicated. You should start out with the presets here. I like "The Quick Fix" for most songs.
• Preview. Try to aim for something that sounds pleasantly full, but natural at the same time. Adjust the amount of reverb by changing these settings:
o Early reflection level: Loudness of the first echoes. It's once again from -60 dB (softest) to 0 dB (loudest).
o Tail level value: Loudness of the echoes of the echoes, as they "die away". This is what makes reverb vocals sound so full and pleasant. Also from -60 dB to 0 dB.
• Remember your settings, cancel, select the whole track, and apply the settings you chose.
• You might want to hear your vocals again with the BGM, since the preview only plays the track you selected by itself. You might need to undo the GVerb and do it again with different settings.
• Sometimes a good pair of headphones can be a liability.
What you hear as just the right amount of reverb someone else with just the good ol' iPod headphones might hear as, well, nothing. Sound quality also differs by sound card. You'll just have to learn about these the hard way, though! So once you export your mp3 later, test it out on another computer, or ask your friend to compare two versions, one with reverb and one without.

Mixing - Panning:
Stereo means different sound signals can be sent to the left or the right speaker. Biologically speaking, your brain interprets a sound as coming from the left if it receives the sound from the left ear a few milliseconds faster than the right ear. Most mixing programs can create the illusion of your vocals coming from the left or right by panning.
• Underneath the Gain slider on each track is an L/R slider that controls how close the sound from this track will sound to the left or right side.
• Harmonies and background singing are good targets for panning to the left or right while your main vocals remain centered.
• If you're mixing a duet or chorus, there are even more possibilities for panning. Be creative!

Part Three

Effect - Radio:
A.K.A. the "tinny" effect, "walkie-talkie" effect, "buzzy" effect, etc. This one is actually produced with *gasp* the EQ! The theory is to cut off the high and low frequency components of your voice, but there's a convenient preset in Audacity you can make use of.
• Effect -> Equalization. In Select curve:, there's "amradio". This simulates the sound from AM Radio stations, which are mostly talk, news, etc. Which makes sense, because if you look at the curve, everything other than the typical speaking range is getting cut.
• You can modify this curve if you want, make the slopes sharper, move the peak to a higher frequency if your voice is higher, make the peak cover a smaller range, etc.
• Use the preview button to experiment! When you're done, you can save your curve for future use.
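To make the "cut off the high and low frequency components" idea concrete, here is a tiny Python sketch of a band-limiting gain curve. The 300 Hz and 3000 Hz band edges and the -30 dB cut are assumptions picked purely for illustration. They are not the values of Audacity's amradio preset, and a real equalizer curve uses smooth slopes rather than a hard step.

```python
def radio_gain_db(freq_hz, low_hz=300.0, high_hz=3000.0, cut_db=-30.0):
    """Return the EQ gain in dB for one frequency.

    Hypothetical curve: frequencies inside the assumed speech band pass
    through untouched (0 dB), everything outside it is cut by cut_db.
    """
    if low_hz <= freq_hz <= high_hz:
        return 0.0
    return cut_db


# Bass and treble are cut, the middle of the voice is left alone.
for freq in (100, 1000, 8000):
    print(freq, "Hz ->", radio_gain_db(freq), "dB")
```

Narrowing the band (raising low_hz or lowering high_hz) corresponds to making the peak cover a smaller range in the Equalization window, and a more negative cut_db plays the role of a sharper slope.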
Effect - Echo:
You can also use this for reverb, but GVerb is better for that, and Echo is more intuitive for ... well, echoing. Maybe you have a song that has dramatic soft to powerful shifts like Starduster and Last Night, Good Night, where the choruses have a bit of echo to accentuate their contrast from the soft parts of the song. Or maybe you wanted to make the last note of the song echo just to sound cool. Just remember to use it in moderation!
• Select where you want to preview, and Effect -> Echo...
o Delay time (s): time between each echo
o Decay factor: how much the sound decays with each echo. 0 means complete decay (no echo), 0.5 means each echo is half as loud as the last one, 1 means the echo will never die out.
• Fun: to hear what going crazy sounds like, apply a delay time of 1 and a decay factor of 1 to, say, 20 seconds of your singing. =D

Effect - Autotune:
This basically picks a scale, and "rounds your pitch to the nearest note", to use a mathematical analogy. Human singing, though, is not that simple, and that's why autotuned voices generally sound very unrealistic. But maybe that's what you're going for, like in Campanella.
• Audacity doesn't come with autotune, but there's a plugin we can download to do the same thing. ChoAkkar introduced it in the second page of the topic. It's called GSnap. Download it here. And extract the contents of the archive into the plugin folder (located where you installed Audacity)
• Restart Audacity and you should now see GVST: GSnap... in the Effects menu.
• How do you use GSnap? I don't actually know... I installed it because I thought I might use it for the Campanella audition, but I ended up leaving it pretty pristine. Maybe someone else will contribute? If not, I'll update this after I try it out.

Exporting to mp3:
Make sure to save your work often! And BTW, Audacity projects can take up a gigabyte of space if you did heavy editing and mixing, so make sure you move some old anime or something.
The LAME plugin we installed at the beginning was to allow for mp3 exporting. Go to File -> Export.
• You can export any format from the list, but most people will choose mp3
• When you do, click on "options" and change the quality. Unless you're stingy about 10 MB of space, use the highest quality setting (320 kbps). This simply tells you how much sound data will be put into the file to represent each second of music (kbps = kilobits per second).
• After you press save, you can enter some tag information. I used to put my name in, but got embarrassed after and now I just leave mine blank.

(Taken from http://ytchorus.forumotion.com/t1687-the-beginner-s-guide-tomixing-audacity )

Wiki creation

We want to make our UTAU official now, so we need to create a profile on this webpage: http://utau.wikidot.com/

Now you don’t need an account to do this, but I find if you have one, it’ll show you are the one who made and edited it. But do whatever makes you comfortable. The page should look like this:

You want to type in your UTAU’s name the western way as instructed and click on Create. This will then take you to a page where you can enter your UTAU’s information. I’ll help you guys in filling it out if the layout confuses you, but it’s pretty self-explanatory.
• Title – The Western Name of your UTAU goes here.
• Western Name – Same as before.
• Eastern Name – This would be where Kanji would go if you have it.
• Kana Name – Now this is where the Hiragana for your UTAU’s name goes. Always study how it should look and if you are not sure, don’t ever be afraid to ask for help.
• Icon – This is the Profile Image you use for your UTAU in the software so just use that.
• Image URL for Official Art (PNG, GIF, JPG only) – You can upload your picture to an external site like photobucket, sta.sh and so on. Just be sure to take the Direct Image Link and paste it in that box.
• Artist Credit for Official Art – If you did not draw the official art, please give credit to the actual artist.
You will be found out if you claim something as yours when it’s not.
• Gender – Obviously. Again, this is not restricted to just Male or Female.
• Age – Just enter the age here, but be sure the age matches the voice.
• Release Date – When you released it officially for download.
• Official Site – This is the site where you post your UTAU-related stuff. I personally use Tumblr, but you can use anything, even your own website if you can design one.
• UTAU Group or Production Team – This really only applies if your UTAU is already part of a group like Vipperloids or something.
• UTAU Voicer – This is where you put the alias of whoever voiced your UTAU. If it wasn’t you, again be sure to give credit, and ask the Voice Provider what alias they wish to use if they don’t want to use their real name.
• UTAU Manager – I’m going to assume this means who manages the UTAU. This would be you.
• File Encoding (especially in the case of Japanese voice banks) – This is how you named the WAV files when recording your sounds. I recorded Minuet in Romaji so she has Romanized filenames. Just pick whatever applies to you.
• OTO.ini Aliasing (especially in the case of Japanese voice banks) – When you made your OTO you would have had to add aliases to the sounds like I stated before. Because Minuet was recorded in Romaji, I had to set her aliases to Hiragana. So again, click what applies to you.
• Voicebank configured on – This is what kind of system you used for making your UTAU. I alternated between Mac and PC so I put both. Click what applies to you.
• Voicebank Details – Provide the download link here. Remember to use something you maintain regularly. I use Google a lot so that’s why I chose Google Drive. Be sure to add extra info here, such as whether it has extra sounds, whether it is CV, VCV or both, what resampler you use, the gender factor if it applies and of course the flag you use for this UTAU.
• Soundcloud, YouTube or Nico Embed – The name pretty much says it all, but most people use Soundcloud for songs and YouTube for videos. You just need the iframe embed code; Google how to find it if you aren’t sure.
• R-18 Content – You need to decide if you allow mature content for your UTAU.
• Commercial Use of Voice banks Allowed? – Again your choice, but be very careful what you pick because it means people can profit from using your UTAU.
• Commercial Use of Character Allowed? – Same as the previous option, but again be careful because this makes it very easy for people to steal your character.
• Do these Terms apply to derivative voices/characters? – A derivative is like what Akaito is to Kaito. Only enable this if you want to give this option, and if you set it to permission required, remember you have the right to say no.
• Link to additional Terms of Use – This is for any further terms you have, so I can’t really explain this one.
• Height – In x’x”ft (xxxcm)
• Weight – In xxxlbs (xxxkg)
• Character Details – This is just some basic information about your UTAU. You can put anything, but I used a template for the info and then just typed that info into the box. I’ll provide that in the next section.
• Image URL for Reference Sheet/Artist Credit for Reference Sheet x 4 – The name explains it all. Same as the Official Image info.
• Click save when you have added everything you want and there you go, you have made your UTAU official!!

So that’s all you need to know! Be sure to maintain your UTAU and respect others. This is a community where music is supposed to bring us together! Just follow the rules and do this to be happy and everything will work out.
The next couple of pages have the character template and some helpful links for UTAU.
I hope this tutorial has helped and here’s to a Music Revolution!
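If you only know your UTAU’s height in centimetres or weight in kilograms, the Height and Weight fields in the wiki form can be filled in with a little arithmetic. Here is a minimal Python sketch of that conversion, assuming 2.54 cm per inch and roughly 2.2046 lbs per kg. The function names format_height and format_weight are made up for this example and are not part of the wiki or of UTAU, and the output uses plain straight quote marks.

```python
# Hypothetical helpers for the Height / Weight fields of the wiki form.
# The names below are illustrative only; they do not come from the wiki.

CM_PER_INCH = 2.54          # exact by definition
LB_PER_KG = 2.20462262185   # pounds per kilogram (approximate)


def format_height(cm: float) -> str:
    """Format a height in cm as x'x"ft (xxxcm), rounded to the nearest inch."""
    total_inches = round(cm / CM_PER_INCH)
    feet, inches = divmod(total_inches, 12)
    return f"{feet}'{inches}\"ft ({round(cm)}cm)"


def format_weight(kg: float) -> str:
    """Format a weight in kg as xxxlbs (xxxkg), rounded to the nearest pound."""
    lbs = round(kg * LB_PER_KG)
    return f"{lbs}lbs ({round(kg)}kg)"


if __name__ == "__main__":
    print(format_height(160))  # 5'3"ft (160cm)
    print(format_weight(50))   # 110lbs (50kg)
```

Rounding to whole inches and pounds is just a choice made here to match the x’x” and xxx placeholders in the form; type in more exact numbers if you prefer.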

Character Profile Template

Western Name (Japanese: (Hiragana Name) - Eastern Name in English)

NAME INTERPRETATION:
Hiragana First Name (English) – Meaning of the name
Hiragana Second Name (English) – Meaning of the name
TYPE: What kind of Loid is it?
MODEL: Their Model number if applicable.

GENDER | VOICE RANGE | RELATED CHARACTERS
AGE | GENRE | HOMEPAGE
WEIGHT | CHARACTER ITEM | CREATOR
HEIGHT | VOICE SOURCE | PICTURE LINK LIST
BIRTHDAY | LIKES | MEDIA LIST
RELEASE DATE | DISLIKES | SIGNATURE SONGS

Personality:

Supplemental Information
Hair color:
Headgear:
Eye color:
Headphones:
Dress:
Nationality/Race:
Favorite phrase:

Useful Links

Changing System Locale
http://windows.microsoft.com/en-us/windows/change-systemlocale#1TC=windows-7

How to type in Japanese Hiragana
http://www.yesjapan.com/video/pages/install-japanese-windows-7-vista.html

How to create a UTAU
https://www.youtube.com/watch?v=b4R73mmrlRs - Part 1
https://www.youtube.com/watch?v=qUUN-gpofbM - Part 2
https://www.youtube.com/watch?v=2pzAhP2tiLA - Part 3
https://www.youtube.com/watch?v=_1jerBrl91g - Part 4
https://www.youtube.com/watch?v=Rxiv7P2JY_Q - Part 5
https://www.youtube.com/watch?v=s_9zKrROTz4 - Part 6
http://www.vocaloidotaku.net/index.php?/topic/46143-reclist-source-andexplanations/ - Fantastic source of different Voicebanks and reclists!
http://purutau.blogspot.co.uk/2011/01/how-to-addedit-flagsbre-etc-in-utau.html - More information on flags
http://fav.me/d850zwc - CV Basic English Reclist (if you plan to make a full English voicebank, it’s recommended to record it as CVVC)
http://fav.me/d7wlw40 - Reclist to record a Japanese CV voicebank with some English sounds
http://visa-to-america.deviantart.com/art/How-to-create-an-UTAUloid203047455 - Explanation of Voicebanks as well as Oremo
https://sites.google.com/site/cvvcenglishusts/reclists - CVVC reclists
https://utaututorials.wordpress.com/utafaq/ - FAQs on UTAU
http://auraautumnus.deviantart.com/art/UTAU-MULTILINGUAL-CV-VCGUIDE-AND-RECLIST-435249586 - Multilingual CV VC reclists (experienced users recommended)
http://fav.me/d7ehw7i - VCV reclist
http://fav.me/d7ehuix - Blank VCV oto

Tutorials about Mixing
https://www.youtube.com/watch?v=HjHh9AE0Sn0
https://www.youtube.com/watch?v=yRFje8SgR_4
Both of these tutorials are excellent when it comes to mixing! These are the ones I used when learning about Mixing.
http://utau.wiki/tutorials:equalization:a-mixing-tutorial - Very good explanation of Equalizers
http://ytchorus.forumotion.com/t1687-the-beginner-s-guide-to-mixing-audacity - Wonderful and detailed tutorial on how to use Audacity for mixing

Software
http://utau2008.xrea.jp/index.html - UTAU software
http://utau.wikia.com/wiki/UTAU_wiki:UTAU_GUI_Translation - English Patch
http://audacity.sourceforge.net/ - Audacity
http://lame.buanzo.org/ - Lame for Audacity

UTAU Voicebank for best reference
http://www.canon-voice.com/index.html - Namine Ritsu
http://utau.wikia.com/wiki/Nami_Utaune - Utaune Nami’s CV Voicebank
https://www.youtube.com/watch?v=3D1rlp-zpgs - Utaune Nami’s VCV Voicebank
http://ladyogien.wix.com/ogien-utau - Axis, Atlas and Kasai voicebanks; excellent references of thorough voicebanks

UTAU User Guide
http://utau.wikia.com/wiki/UTAU_User_Manual

UTAU wiki
http://utau.wikia.com/wiki/UTAU_wiki - Version 1
http://utau.wikidot.com/ - Version 2

How to make an UTAU sound better
http://fav.me/d4u2xgr - How to make male UTAUs clearer