OK so I need to vent. Had a call yesterday that went sideways in the first four minutes and never really recovered. The client is a German pharmaceutical company. Annual general meeting. 340 attendees on the Zoom call, interpretation into Mandarin and Japanese. Seems straightforward enough. Their IT department ran the tech check on Monday, everything was green. I personally double-checked the interpreter workstations on Wednesday. All good. Then Thursday 9 AM rolls around and...
The first thing that went wrong was something nobody could have predicted. The CEO's home office internet connection dropped twice in the first ten minutes. Not the interpreters. Not the RSI platform. The CEO. The one person whose voice is the entire reason the meeting exists. His fiber optic provider had an outage in his neighborhood. He switched to a mobile hotspot and the audio quality went from broadcast quality to what I can only describe as "2007 YouTube."
This is the thing about RSI (remote simultaneous interpretation) that nobody in the sales meetings wants to talk about: you can control the interpreter's setup. You can't control the speaker's setup. And if the speaker's audio is garbage, the interpreter's output is garbage, because the interpreter is working from garbage input. Our Mandarin interpreter later told me she missed two drug name references because the CEO's mobile connection compressed the audio just enough that the difference between two similar-sounding chemical names became ambiguous. These are exactly the terms you cannot get wrong in pharmaceutical interpretation. She handled it by saying the drug name once and then immediately adding a qualifier naming what the drug is indicated for, which is clever, but shouldn't be necessary.
The stuff that IS in your control (but people still mess up)
Alright. Venting over. Sort of. Let me talk about the things that I can actually help with, because they come up on every single engagement and I'm tired of having the same conversation.
Interpreter audio quality. I'm going to be very specific here because I've seen people take shortcuts that I genuinely don't understand. The interpreter needs a USB audio interface. Not the headphone jack on their laptop. Not a USB headset with a built-in sound card that does its own DSP. A proper audio interface. A Focusrite Scarlett Solo is like $120. That's it. That's the fix for about 40% of the audio issues I see in RSI sessions. The laptop's built-in audio adds its own latency and runs input and output through the same shared pipeline, which creates echo and feedback that the RSI platform's echo cancellation sometimes handles and sometimes doesn't.
Headphones. Closed-back. Over-ear. This is not negotiable and I've had two separate clients this year try to use AirPods for a six-hour interpretation session. AirPods! For simultaneous interpretation! The thing about open-fit earbuds is that they leak sound. The interpreter is hearing the speaker and speaking at the same time. If the audio from their earbuds leaks back into the microphone, it creates a feedback loop, and the echo cancellation algorithm starts doing weird things. The interpreter ends up hearing their own voice with a 200ms delay while they're trying to concentrate on the speaker. One interpreter told me afterward it felt like being haunted by themselves. Which is a great description and also a complete failure of the setup.
Network. I wrote a whole thing about this for a client's internal wiki and they still showed up to a 3-language meeting with interpreters on hotel Wi-Fi. Hotel Wi-Fi. For simultaneous interpretation. The packet loss was 6%. The audio quality was what I'd charitably describe as "functional" but honestly it was bad. The Japanese interpreter asked if we could switch to consecutive interpretation midway through. We couldn't, because the meeting format was designed for simultaneous. So she powered through, and did a remarkable job given the conditions, and I made a note in the post-session report that was basically a three-page passive-aggressive essay about bandwidth requirements.
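For what it's worth, the arithmetic behind a number like that 6% is not complicated. Here is a minimal Python sketch of it, not taken from any RSI platform. The function names are hypothetical, the packet counts are made up to reproduce the 6% figure, and the 1% ceiling is an assumed placeholder rather than a published requirement. Use whatever limit your provider actually specifies.

```python
def packet_loss_percent(sent: int, received: int) -> float:
    """Percentage of packets that never arrived."""
    if sent <= 0:
        raise ValueError("sent must be a positive packet count")
    if not 0 <= received <= sent:
        raise ValueError("received must be between 0 and sent")
    return 100.0 * (sent - received) / sent


def acceptable_for_rsi(loss_percent: float, max_loss_percent: float = 1.0) -> bool:
    """Compare measured loss against a ceiling.

    The 1.0 default is an assumption for illustration only.
    """
    return loss_percent <= max_loss_percent


# Illustrative counts: 1000 packets sent, 940 received.
loss = packet_loss_percent(sent=1000, received=940)
print(loss)                      # 6.0
print(acceptable_for_rsi(loss))  # False
```

The point of writing it down is that "the Wi-Fi seemed fine" becomes a number you can put in a pre-session email instead of a post-session report.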
The RSI platform question
Clients ask me which platform to use all the time. The honest answer is: it depends less on the platform than you'd think. The big ones (Interactio, KUDO, Voiceboxer, ZipDX) are all fine. They've all been around long enough to have solved the basic reliability problems. The differences are in integration — how cleanly they sit alongside Zoom or Teams — and in the terminology management features.
What's actually new this year is that two of these platforms now have AI-assisted terminology lookup that works reasonably well. The interpreter speaks, and a sidebar shows likely translations for the terms it recognizes. It's pulling from a glossary that the client uploads beforehand. In my experience it catches maybe 60–70% of the domain-specific terms correctly, which means the interpreter still needs to manually verify the other 30–40%. But the time savings are real. One of my Mandarin interpreters told me it probably saved her 15–20 minutes of active lookup time over a two-hour session, which for simultaneous interpretation is actually significant because every second of lookup time is a second of attention taken away from listening.
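To make the glossary idea concrete, here is a deliberately naive Python sketch of looking up recognized terms in a client-supplied glossary. I don't know how any of these platforms implement it internally, so treat this as an illustration of the concept only: `lookup_terms` is a hypothetical name, the two glossary entries are generic examples, and real systems presumably do far more than exact phrase matching.

```python
import re


def lookup_terms(transcript: str, glossary: dict[str, str]) -> list[tuple[str, str]]:
    """Return (source term, target term) pairs for glossary terms found in
    the transcript, in order of first appearance. Matching is
    case-insensitive and whole-phrase only."""
    lowered = transcript.lower()
    hits = []
    for term, target in glossary.items():
        pattern = r"\b" + re.escape(term.lower()) + r"\b"
        match = re.search(pattern, lowered)
        if match:
            hits.append((match.start(), term, target))
    hits.sort()  # order by position in the transcript
    return [(term, target) for _, term, target in hits]


# Hypothetical glossary, uploaded by the client before the session.
glossary = {
    "clinical trial": "临床试验",
    "placebo": "安慰剂",
}

for term, target in lookup_terms("The placebo arm of the Clinical Trial closed.", glossary):
    print(term, "->", target)
```

Anything the glossary doesn't contain simply never shows up, which is one reason a lookup like this only gets you partway and the interpreter still has to verify the rest by hand.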
The thing that hasn't changed: the RSI platform runs alongside the video meeting, not inside it. I know Zoom has built-in interpretation. I've used it. It works for casual bilingual meetings. It does not work for professional simultaneous interpretation with three languages and 340 attendees. The audio routing is different. The latency characteristics are different. The interpreter experience is different. If you're running a meeting where interpretation quality matters, use a dedicated RSI platform.
A thing about the tech check
Every RSI provider asks for a tech check 24–48 hours before the session. Every client agrees. About 30% of clients actually do it properly. The rest either skip it entirely or do it at 10 PM the night before when network conditions are completely different from the 9 AM meeting time.
This is important and I'm going to explain why in a way that actually makes sense: network performance varies by time of day. Your office network at 10 PM is empty. At 9 AM, 400 people are logging in, downloading updates, streaming things, whatever. The network conditions during the tech check should approximate the network conditions during the actual meeting. If you test at 10 PM and the meeting is at 9 AM, your test results are basically fictional. I've started putting a note in my pre-session emails that says "please run the test at the same time of day as the meeting" and about half the clients follow this instruction. The other half... well, that's why I keep a list of post-mortem documentation templates.
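If you want to enforce the "same time of day" rule rather than just ask nicely, a check like the following would do it. This is a minimal Python sketch under my own assumptions: `same_time_of_day` is a hypothetical helper, the 60-minute tolerance is an arbitrary choice, and the dates are placeholders. It only compares clock times, so it won't notice a weekend test for a weekday meeting.

```python
from datetime import datetime


def same_time_of_day(check: datetime, meeting: datetime,
                     tolerance_minutes: int = 60) -> bool:
    """True if the tech check ran within tolerance_minutes of the meeting's
    clock time, regardless of which day it ran on."""
    check_min = check.hour * 60 + check.minute
    meeting_min = meeting.hour * 60 + meeting.minute
    diff = abs(check_min - meeting_min)
    diff = min(diff, 24 * 60 - diff)  # handle wrap-around at midnight
    return diff <= tolerance_minutes


meeting = datetime(2024, 1, 11, 9, 0)        # placeholder: a 9 AM meeting
late_check = datetime(2024, 1, 10, 22, 0)    # 10 PM the night before
morning_check = datetime(2024, 1, 10, 9, 30) # 9:30 AM the day before

print(same_time_of_day(late_check, meeting))     # False
print(same_time_of_day(morning_check, meeting))  # True
```

A 10 PM check fails this test by design. That's the whole idea.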
The one thing I'd change about how clients approach RSI
Brief the interpreters. Properly. Not "here's the glossary, good luck." The interpreters need context about the meeting: who's speaking, what the agenda is, what the sensitive topics are, what acronyms the company uses internally, which product names are trademarked and need to be left in English versus translated. This sounds obvious. It apparently isn't obvious, because I get pushback on this from clients who think it's too much effort.
I had a client last month — won't name them — who scheduled a four-hour board meeting with interpretation into three languages and sent the glossary 20 minutes before the start time. Twenty minutes. For a pharmaceutical board meeting with proprietary drug names and clinical trial references. The interpreters did their best. It was not enough. The post-session feedback from the non-English-speaking attendees was, diplomatically, "mixed." The client blamed the platform. The platform was fine. The interpreters were excellent. The problem was that nobody briefed them. But nobody wants to hear that the failure was in preparation, not technology, because preparation is boring and technology is exciting and it's easier to blame the software.
Anyway. The pharmaceutical company from yesterday — we're doing a follow-up session next week. The CEO is getting a wired Ethernet connection installed at his home office. The interpreters are getting three days of prep material instead of 20 minutes. And I'm probably going to write another passive-aggressive memo about bandwidth requirements, because apparently once is not enough.
Artlangs Translation runs RSI operations across 230+ languages. The technology matters. The interpreter briefing matters more. We handle both, because doing one without the other is like buying a race car and putting regular gas in it.
