Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Send ID3 tag even when it has malformed content #408

Merged
merged 3 commits into from
May 26, 2022
Merged

Send ID3 tag even when it has malformed content #408

merged 3 commits into from
May 26, 2022

Conversation

pszemus
Copy link
Contributor

@pszemus pszemus commented Feb 18, 2022

Enables sending ID3 tag with as many frames as could be parsed.

Without this change this.trigger('data', tag) wasn't called if stumbled upon malformed ID3 tag/frame.

Beneficiaries: Tellos Z/IP Stream encoder produces ID3 data with TIT2, WOAF and TENC frames, from which only TIT2 can be parsed by mux.js, so let's take those frames that have been parsed.

Additionally, I'll take a look at mux.js' ID3 parser and try to find out if that metadata is really malformed or it's an issue with the parser.

Enables sending ID3 tag with as many frames as could be parsed.
@pszemus
Copy link
Contributor Author

pszemus commented Mar 31, 2022

can anyone pick up this PR?

@joeyparrish
Copy link
Contributor

Ping! Can anyone look at this PR? We'd love to have this fix for shaka-project/shaka-player#3761

Copy link
Contributor

@gesinger gesinger left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for the PR!

I think this change makes sense as a more lenient approach to malformed content, but it may be worth looking at the encoder and content themselves.

If this is the line that breaks the sample content, then the sample content appears to not have a valid frame header, as the size should be the size of the frame (which should be greater than 0).

Judging from

if (tagParsers[frame.id]) {
and
tagParsers = {
TXXX: function(tag) {
var i;
if (tag.data[0] !== 3) {
// ignore frames with unrecognized character encodings
return;
}
for (i = 1; i < tag.data.length; i++) {
if (tag.data[i] === 0) {
// parse the text fields
tag.description = parseUtf8(tag.data, 1, i);
// do not include the null terminator in the tag value
tag.value = parseUtf8(tag.data, i + 1, tag.data.length).replace(/\0*$/, '');
break;
}
}
tag.data = tag.value;
},
WXXX: function(tag) {
var i;
if (tag.data[0] !== 3) {
// ignore frames with unrecognized character encodings
return;
}
for (i = 1; i < tag.data.length; i++) {
if (tag.data[i] === 0) {
// parse the description and URL fields
tag.description = parseUtf8(tag.data, 1, i);
tag.url = parseUtf8(tag.data, i + 1, tag.data.length);
break;
}
}
},
PRIV: function(tag) {
var i;
for (i = 0; i < tag.data.length; i++) {
if (tag.data[i] === 0) {
// parse the description and URL fields
tag.owner = parseIso88591(tag.data, 0, i);
break;
}
}
tag.privateData = tag.data.subarray(i + 1);
tag.data = tag.privateData;
}
},
it seems that mux.js supports TXXX, WXXX, and PRIV, and will break out of frame parsing for any other types.

Did you have a specific segment that's breaking on this line? I'd be interested to take a look, and if you have any other details on the ID3 tags that might help.

It would also be worth us adding a test, if you don't mind writing one up.

Thank you again!

lib/m2ts/metadata-stream.js Outdated Show resolved Hide resolved
lib/m2ts/metadata-stream.js Show resolved Hide resolved
@pszemus
Copy link
Contributor Author

pszemus commented May 26, 2022

it may be worth looking at the encoder and content themselves

After further investigation it turns out that, if the encoder is missing the audio URL, it creates an ID3 WOAF frame with size of 0B, which is not allowed according to ID3 documentation. Encoder should insert an ID3 frame with empty string or abort inserting such a frame. I've managed to report that case to the developer.

Nevertheless I'd recommend merging this PR.

it seems that mux.js supports TXXX, WXXX, and PRIV, and will break out of frame parsing for any other types.

mux.js won't decode the information sent in text frames other than TXXX (e.g. TIT2 the encoder is using for holding the song title), but it passes the raw binary data to the client (frame.data)

data: tag.data.subarray(frameStart + 10, frameStart + frameSize + 10)

so I'm able to decode it myself.

I'll be willing to spend some more time and try to implement decoding of text content for all text frames (T*, e.g. TIT2).

Did you have a specific segment that's breaking on this line? I'd be interested to take a look

I have a 5sec .ts file with malformed ID3 frame in a way described in the beginning of this commend. I can send it to you if you're interested.

It would also be worth us adding a test, if you don't mind writing one up.

I'll try my best and get back to you soon.

validFrame = id3Frame('TIT2',
0x03, // utf-8
stringToCString('sample title')),
malformedFrame = id3Frame('WOAF'), // frame with size of 0B
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I could also create the same malformed frame like this:

stringToInts('WOAF').concat([
        0x00, 0x00, 0x00, 0x00, // size
        0x00, 0x00 // flags
      ]),

to make the malformation more visible

@gesinger
Copy link
Contributor

Thank you again @pszemus , I think you're right that we should add this despite it being a content issue. If encoders take the approach of including malformed ID3 frames it still makes sense to pass back valid ones already parsed.

I'll be willing to spend some more time and try to implement decoding of text content for all text frames (T*, e.g. TIT2).

Any contributions are very much appreciated if you have the time!

I have a 5sec .ts file with malformed ID3 frame in a way described in the beginning of this commend. I can send it to you if you're interested.

If the content is not public enough to share here, and you can share it via another medium (e.g., the Video.js Slack), let me know. It would be good for testing.

And thank you for adding the test!

@pszemus
Copy link
Contributor Author

pszemus commented May 26, 2022

This is the .ts file with malformed WOAF frame (0B content size): seg64000-00031954.ts.zip

If you're happy with this PR you can merge it. Changes regarding parsing other frames will be sent in another PR.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants