Tim Siglin: Welcome back to Almost Live here at Streaming Media West 2016. I've got with me Heather Hurford, who is a live video producer at LinkedIn. Heather, thank you for joining us.

Heather Hurford: Thank you, Tim.

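Tim Siglin: I understand that one of the challenges you've been trying to solve, and it's a huge challenge for all of us, is live closed captioning. There are a lot of different pieces to it. How did you break down the problem and look for solutions?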

Heather Hurford: It was really out of need, a need at LinkedIn to add captioning for our own internal needs, in an effort to make our work environment more accessible and inclusive. I have a background in language, in IT, and in production, so it was a great project for me to take on. I'd actually worked on adding closed captioning to a nationwide television channel back in the early 2000s, when the FCC mandate came out to add captions to all broadcasts, so I had some experience. I raised my hand, and it turned out to be more challenging than I'd thought. I think the hardest thing right now is that the standard that exists in the broadcast world has not been adopted for the online world, and the reasons behind that, as I've been finding out, aren't cut and dried.

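Tim Siglin: I go back 18 years in this industry. I remember SMIL files were part of the Real implementation, and then there were other technical solutions on Windows Media. We had format issues. That's one problem. What are the other issues you're finding as to why things weren't adopted or why there's no standard per se, context, etc.?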

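Heather Hurford: Captioning for live is different from captioning for video on demand, so I'm very focused on live, because I think for on demand there are actually some pretty decent solutions. There are a lot of different file formats you can use. Most platforms support more than one file format, so you have a lot of options. When it comes to live, a lot of players don't actually support true closed captioning.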

Tim Siglin: So you're saying the on-demand players from a particular video platform might support the time tags but the live video player does not?

Heather Hurford: Exactly.

Tim Siglin: Okay. So it's not as simple as it was in the old Line 21 days, when we did the insertion on Line 21.

Heather Hurford: No, and that's exactly the issue: there's not really a standard that's been adopted across online media players. YouTube had a solution for a while that had captioners going in and adding captioning information on the back end, and it wasn't great for the kind of programming that we do at LinkedIn, which is constant narration without a lot of breaks, so it was really hard for it to keep up. We would lose a lot of text. Then YouTube was one of the first to support the 708 standard, which is the digital...

Tim Siglin: The equivalent of Line 21, right.

Heather Hurford: Exactly, in the broadcast world. That's great because all the gear is out there, the caption encoders, and that's the information they spit out. So we added a closed captioning encoder to our signal flow, and when we're live on YouTube that works great, but a lot of other platforms don't support it. The internal player that we use, for example, doesn't have any way of taking that data and decoding it in the player. Essentially, what we're doing internally right now is creating two streams, one with burned-in captions and one without, and we let the viewer decide which experience they want instead of having that nice toggle back and forth.

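Tim Siglin: Toggle back and forth. Okay, so essentially you're doing what we used to call a timecode burn-in when we transferred from film, so actually they...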

Heather Hurford: It's open captions.

Tim Siglin: Wow. That's crazy.

Heather Hurford: Obviously it's not...

Tim Siglin: It's not an elegant solution.

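Heather Hurford: It's not an elegant solution. Like I said, it's a temporary workaround. It's all platform dependent, so we just streamed our biggest conferences to Ustream, and Ustream supports 708. We were able to get a really nice user experience with closed captioning.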

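Tim Siglin: How has the death of Flash affected what you do? One of the beauties of Flash is that, as a player, it inherently had some timed text capabilities in it. Obviously we're now moving to HTML5 players. Are you finding that the HTML5 players are better, that companies are taking this into account, or is it still a hodgepodge of some who support it and some who don't?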

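Heather Hurford: I'm finding a hodgepodge. YouTube and Ustream are the only big players I've found in the live space that support it. Others will say they support 708, but they don't actually support the 708 standard; they offer a different solution that usually isn't a great user experience. I've seen some captions... I use quotes, "captions," because it's actually a scrolling transcript that pops up in a separate window.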

Tim Siglin: A window, right, exactly.

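Heather Hurford: With most of those solutions you are obligated to use the captioning provider that they're partnered with, and the quality tends not to be very good. I always like to point out that comprehension goes hand in hand with accuracy, so if you and I are having a conversation and you only understand 70% of what I'm saying, that's not a great conversation.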

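Tim Siglin: That brings me to an interesting question. Over the last decade or so I've looked at a number of speech-to-text solutions. One of the ideas was that we'll just plug a speech-to-text engine in there, have it do the work, and then put that up for captioning, but in reality, unless it's trained, you get 65% accuracy. The other option is to have somebody sit there and type it, and of course, as we've all seen watching live news captioned that way, some of it comes out phonetic, etc. You can come back later and clean it up for the on-demand asset, but what's the best solution for doing this? What are your thoughts on that?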

Heather Hurford: There's actually a middle ground, which I discovered in the process of adding captioning at LinkedIn, because it is really hard to train traditional transcriptionists to get beyond a certain accuracy level. The really exceptional ones who can get into the 90% range are few and far between, and they're in high demand. As you scale, as the volume of content goes up, that doesn't work. I found that in other parts of the world they're also using voice writers, so a speech-to-text solution where there's still a human being who's taking in the content and re-speaking it.

Tim Siglin: Ah, reading it. Re-speaking it.

Heather Hurford: Re-speaking it at a speed...

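Tim Siglin: The system is trained. The speech-to-text engine is trained for them, so it can understand them, and if you're working in multiple languages, that would also help from a translation standpoint, I would think. Okay.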

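Heather Hurford: Yes, so that solves the accuracy problem. It keeps the human element. The AI isn't there yet, so you still need a human being to make decisions and interpret in the moment, frankly.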

Tim Siglin: That's fascinating. They could be interpreting someone in their own language...

Heather Hurford: That's what we're doing.

Tim Siglin: Or they could be interpreting someone else.

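Heather Hurford: What we're doing now is all English, and the accuracy we're seeing is incredible, and the cost is actually... I don't want to put a number on it, but it's considerably cheaper than what I was paying for live captioning in 2002.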

Tim Siglin: What we used to get upset about, our mothers or grandmothers sitting next to us and repeating everything that was being said on TV, has now actually turned into something lucrative for a person who can do that.

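Heather Hurford: It is. It's a skill, and here's the interesting part when it comes to skill: transcriptionists take several years to get trained up. It's a skill set with an aging workforce, most of the people who do it, whereas these people, who are also called re-speakers or voice writers, can be trained in just a few months.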

Tim Siglin: How many of these voice writers have LinkedIn profiles?

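Heather Hurford: You know, I don't know yet, but like I said, there's a real business opportunity in the U.S. I know what they're doing in Europe... everything is captioned and subtitled, and in multiple languages. There's tremendous demand and scale.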

Tim Siglin: And it's the kind of thing you don't need to be in the room to do.

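Heather Hurford: In fact, what we're doing is using an EEG encoder with iCap, so our captioning provider is remote. They dial in to the iCap cloud software with a passcode. We send the program audio there, they just hear the audio, and then they send the caption data back.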

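Tim Siglin: Given that it's a stream and you've got several seconds of delay, and they're getting the audio signal in real time, just like on a conference call, that means the captions will sync up with the video.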

Heather Hurford: Well, some of EEG's encoders have a feature where you can introduce even more delay to close that gap.

That's what we do. We've actually closed that gap to within five seconds. Sometimes it varies, and sometimes the captions will be right on and even lead by a second or two, which is... I always wonder what the viewers think.

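Tim Siglin: The lead can be a strange thing. "He's going to say that in two seconds." Well, Heather, very interesting conversation. Thank you very much.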

Heather Hurford: Thank you, Tim.

Tim Siglin: Once again, this has been Heather Hurford from LinkedIn, a live video producer, talking about the challenges of live video captioning.

