Being a Timer is really the easiest job you can have on the staff, but it's probably the most annoying one too.
You don't need to know the language to time it - you just need to be able to grab the line when a character starts speaking, and stop it when they're finished talking. If a line seems too long, you find a spot where you can make a clean break and split it there. Because all you're looking for are the lines, you don't necessarily need to know Japanese. I happen to know the language, though not fluently, so it's a plus for me when deciding where to break a long line.
You always start as close to the beginning of their line as possible, but leave a little bit of buffer at the end. The most annoying problem is listening to the line over and over again to make sure you have it right. The second is when more than one character speaks at once, and the third is when the line you're trying to grab has low volume, either because the speaker is somewhere far away on the screen or because some other noise is polluting the line.
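The "tight start, small buffer at the end" rule can be sketched in a few lines of Python. This is only an illustration, not what the timing program actually does: the function name `add_end_buffer`, the millisecond tuples, and the 300 ms default are all assumptions made for the example.

```python
def add_end_buffer(lines, buffer_ms=300):
    """Pad the end of each timed line without running into the next one.

    lines: list of (start_ms, end_ms) tuples, sorted by start time.
    buffer_ms: how much breathing room to add at the end (assumed value).
    """
    padded = []
    for i, (start, end) in enumerate(lines):
        new_end = end + buffer_ms
        if i + 1 < len(lines):
            next_start = lines[i + 1][0]
            # Never push the end past the next line's start; if the lines
            # already overlap (two characters talking), leave the end alone.
            new_end = min(new_end, max(end, next_start))
        padded.append((start, new_end))  # the start is never moved
    return padded


timed = [(1000, 2000), (2100, 3000), (5000, 6000)]
print(add_end_buffer(timed))
```

The start times are left untouched on purpose, since only the end of a line gets the extra room.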
The program requires a WAV file of the audio, not the actual raw video itself. I need a set of headphones so I can listen carefully to the lines.
The way TV-Nihon does it, timing comes first. Using the program they supplied to me, I just look for the lines and put a placeholder where each line is going to be. The placeholder I use is a bunch of dots, mixing between "..." and ".....". Doing it this way, when I watch the video and a character speaks, I'll see those dots and know whether I timed it correctly. Most of the time they don't send me the actual video, but if they do, I test the timing on it after I'm done. If not, I play a different video with the WAV audio and the subtitles loaded, and just listen to the audio while I watch the screen for the timing.
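To make the placeholder idea concrete, here is a rough sketch that turns a list of timings into a dotted placeholder script. It assumes SubRip (.srt) output and a strict alternation between "..." and ".....", which may not match the format or the pattern the actual program uses; the function names are made up for the example.

```python
def format_srt_time(ms):
    """Convert milliseconds to the HH:MM:SS,mmm form used in SRT files."""
    hours, rem = divmod(ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    seconds, millis = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d},{millis:03d}"


def placeholder_srt(timings):
    """Build an SRT script whose lines are only dots.

    timings: list of (start_ms, end_ms) tuples in playback order.
    Alternating the two placeholders makes back-to-back lines easy to
    tell apart on screen (an assumption about why they are mixed).
    """
    placeholders = ["...", "....."]
    blocks = []
    for index, (start_ms, end_ms) in enumerate(timings, start=1):
        text = placeholders[(index - 1) % 2]
        blocks.append(
            f"{index}\n"
            f"{format_srt_time(start_ms)} --> {format_srt_time(end_ms)}\n"
            f"{text}\n"
        )
    return "\n".join(blocks)


print(placeholder_srt([(1000, 2500), (61_250, 63_000)]))
```

Loading a file like this alongside the audio is enough to check the timing by eye, and the translators can later swap the dots for real lines without touching the timestamps.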
Then there are the translators, who take the lines that I've grabbed and turn them into English. They create a script out of it, then send it to the typesetters. The typesetters are the ones who get the script, create the font, choose the coloring, make the effects in the font, and actually put the lines in the show. Usually, if a line I've grabbed doesn't fit the translation or doesn't sync well with the video, they'll adjust the timing slightly to make it look better. Finally, it goes to the editor, who watches the whole thing, corrects the mistakes, and then finalizes the video.
There's a separate Karaoke typesetter, who does the same thing as normal typesetters except they do it strictly for the songs in the show. In this case, they do both the timing and the typesetting, though I believe the translators still hand them a script of the lyrics. Otherwise, they might translate it themselves until the OST with the lyrics comes out. This is probably the hardest job, since it requires precise timing and ingenuity.
After I've mastered timing, I'm going to try to look into typesetting. It takes a little bit of visual creativity, especially since TV-Nihon is best known for using pretty colored fonts that match the sequences of events in the shows they do.
Once they release the projects I've done, I'll put them here and see how good (or bad) of a job I did as a Timer.
Edit: Actually, I have a video here from a tutorial, which came with a script that I practiced timing on. It was the opening to an old anime, but I didn't do the karaoke since the tutorial didn't teach me how. I'll upload that when I get home so you can see it.