Live Transcription on Zoom for Ubuntu
As the pandemic unfortunately continues throughout the world and is now approaching two years old, the state of affairs has at least given many of us time to adjust to using video conferencing tools. The two that I use the most, by far, are Google Meet and Zoom.
I prefer using Google Meet, but using Zoom is unavoidable since it’s become the standard among my colleagues in academia. Zoom is likely used more widely than Google Meet because it is accessible in China. (Strangely, though, I was recently on a Zoom call with someone I knew in Beijing, who told me he needed a Virtual Private Network (VPN) to use Zoom, so maybe I’m not fully understanding how VPNs work.)
The main reason I continue using Google Meet is the quality of its live transcription. Just before the pandemic started, I remember getting on a virtual call with Google host Andy Zeng for what I call a “pre-interview interview.” (For research scientist internships at Google, a host will typically have already selected an intern in advance.) Being from Google, Andy had naturally set up a Google Meet call, and I saw that there was a “CC” button and clicked on it. Then the live transcription started appearing at the bottom of our call, and you know, it was actually pretty darn good.
When the pandemic started, I don’t think Zoom supported this feature, which is why I asked to use Google Meet for meetings I was involved in. It took a while, but Zoom was able to get live transcription working … but not for Ubuntu systems, until very recently. As of today (November 13, 2021) with Zoom version 5.8.3, I can launch a Zoom room on my Ubuntu 18.04 machine and enable the live transcription, and it works! For reference, I had been repeatedly trying to get live transcription working on Ubuntu up until October 2021 without success.
This is a huge relief, but there are still several caveats. The biggest one is that the host must explicitly enable live transcription for participants, who can then choose to turn it on or off on their end. Since I have had to repeatedly ask Zoom hosts to enable live transcription so that I could use it, I wrote up a short document on how to do this, and I put a link to it near the top of my new academic website.
I don’t quite understand why this requirement exists. I can see why it makes sense to have the host enable captioning if it comes from third-party software or a professional captioner, since there could be security concerns there. But I am not sure why Zoom’s built-in live transcription requires the host to enable it. This seems like an unusual hassle.
Two other downsides of Zoom’s live transcription, compared to Google Meet, are that (empirically) I don’t think the transcription quality is that good, and that Zoom’s captions only span a short width of the screen, whereas with Google there’s more text on the screen. The former seems to be a limitation of the software, and Google might have an edge there due to their humongous expertise in AI and NLP, but the latter looks like an API issue that should be easy to resolve. Oh well.
I’m happy that Zoom seems to have integrated live transcription support for Ubuntu systems. For now I still prefer Google Meet, but this makes the Zoom experience somewhat more usable. Happy Zoom-ing!