-
Notifications
You must be signed in to change notification settings - Fork 211
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: Add multibyte character support #398
Conversation
4c9f4b8
to
091ad78
Compare
Co-authored-by: Gary Katsevman <git@gkatsev.com>
Co-authored-by: Gary Katsevman <git@gkatsev.com>
…x.js into multibyte-708-support
}); | ||
} else { | ||
try { | ||
this.textDecoder_ = new TextDecoder(encoding); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This appears to throw if the encoding was incorrect. We may want to concat the caught error onto our warning below.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could use a test for:
- passing the encoding around
- Multi-byte character examples (if we can pull generic words out of our sample for use)
- Not having a TextDecoder
- An invalid encoding
// Use the TextDecoder if one was created for this service | ||
if (service.textDecoder_ && !isExtended) { | ||
if (isMultiByte) { | ||
charCodeArray = [currentByte, nextByte]; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should subarray here instead to prevent the creation of a new Uint8Array below.
// for multi-byte (can they be more than two characters?)
charCodeArray = packetData.subarray(i, i + 1)
// for single
charCodeArray = packetData.subarray(i)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good call
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@brandonocasey Actually, packetData
isn't a typed array so we do need to create a Uint8Array here. And to address your other question, from the spec:
When a decoder encounters the P16 code, it uses the succeeding two bytes to
address characters in a 16-bit code set.
Co-authored-by: Gary Katsevman <git@gkatsev.com>
…x.js into multibyte-708-support
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Approved pending the tests pass
test/caption-stream.test.js
Outdated
QUnit.test('Creates TextDecoder only if valid encoding value is provided', function(assert) { | ||
var secondCea708Stream; | ||
|
||
cea708Stream = new m2ts.Cea708Stream({ | ||
captionServices: { | ||
SERVICE1: { | ||
encoding: 'euc-kr' | ||
} | ||
} | ||
}); | ||
|
||
cc708Korean.forEach(cea708Stream.push, cea708Stream); | ||
cea708Stream.flushDisplayed(4721138662, cea708Stream.services[1]); | ||
|
||
assert.ok(cea708Stream.services[1].textDecoder_, 'TextDecoder created successfully when encoding is valid'); | ||
|
||
secondCea708Stream = new m2ts.Cea708Stream({ | ||
captionServices: { | ||
SERVICE1: { | ||
encoding: 'invalid' | ||
} | ||
} | ||
}); | ||
|
||
cc708Korean.forEach(secondCea708Stream.push, secondCea708Stream); | ||
secondCea708Stream.flushDisplayed(4721138662, secondCea708Stream.services[1]); | ||
|
||
assert.notOk(secondCea708Stream.services[1].textDecoder_, 'TextDecoder not created when encoding is invalid'); | ||
}); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this test needs to check for IE11 or if TextDecoder is present.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added a check
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Verified locally. Can see captions displayed properly instead of random characters!
Description
This adds multibyte character support to the 708 implementation via a new
encoding
option that can be passed to theCaptionStream
constructor. The intent is to permit legacy multi-byte encodings for Chinese, Japanese, Korean, and others. The value of the option is used to create aTextDecoder
instance (assuming the current browser supports it, only IE doesn't), so the value can be any "valid [encoding] label".I'm waiting to commit the test I wrote until I have an open-source content sample to use