Discussion:
[gst-devel] Common Opensource codec API, was: Re: the road to 1.0 and beyond
ChristianHJW
2003-06-26 20:59:14 UTC
hello mike,
this sounds like exactly the right direction for future developments to me.
- does ffmpeg's codec api have all features needed, e.g. stuff like
cache-local colorspace conversion (per-slice) like libmpeg2 has?
- does ffmpeg have a plugin concept for dynamic loading of plugins?
if not, do you see a chance to talk them into introducing such a
feature?
- how about demuxer plugins? could this api be shared as well?
how about other projects? does anybody on this list have contacts and/or
deeper knowledge about other multimedia projects in the free software
world? i'm thinking of mplayer g2, videolan and gstreamer here.
Ronald 'BBB' Bultje, one of the core devs of the Gstreamer team is
reading here AFAIK, so he could make contact i guess.

The Gstreamer people do have a nice API with their 'gst plugin API'
AFAIK, but it doesn't help them a lot as it's tied to Gstreamer as a
framework, and so no codec developers will support it natively. So, in
the end, they have to make all the plugins themselves now (i guess BBB
is doing that) and also maintain them.

If there was a common opensource codec API, a standard wrapper plugin
could maybe be enough for them to cover most formats?

I was talking with Erik 'omega' Walthinsen a couple of times on IRC, he's
the founder of Gstreamer and certainly knows a lot about this kind
of stuff, and he told me he already had a neat concept in the very back
of his mind, but just no time to put it on paper. Maybe if you
guys showed some real interest in cooperating with them here, you could
start an avalanche ....
my basic reasoning behind this is that now that all these different
projects are getting close to their 1.0 releases (ok, not all of them ;)
) and so many developers have gained experience on how to design media
players, codecs, do audio/video output, etc. it would be interesting to
see if and how some collaboration between these projects could be
established.
guenter
Exactly my thoughts also. Must be because we're both Germans, Guenter :-D !!

Christian
Guenter Bartsch
2003-06-26 21:49:03 UTC
hallo christian,

on the one hand i am happy the subject has changed as this discussion
was getting pretty off-topic, on the other hand i dislike two things
about the new topic:

- opensource
while i dislike the word in itself (i never really understood what
the opensource movement is about), i think it limits this discussion
to just opensource codecs - but binary-only codecs play an important
role in today's free software world, like it or not (think quicktime,
think real codecs, for example)

- codecs
i'd like to broaden the discussion to all kinds of plugins, especially
demuxers and a common input abstraction layer

so, basically i think it would be interesting to see if it is possible
to agree on a common standard for free multimedia plugins, especially
for

- a common input abstraction layer
- demuxers
- codecs

and maybe also

- audio/video output

among some media player projects.
Post by ChristianHJW
Ronald 'BBB' Bultje, one of the core devs of the Gstreamer team is
reading here AFAIK, so he could make contact i guess.
The Gstreamer people do have a nice API with their 'gst plugin API'
AFAIK, but it doesn't help them a lot as it's tied to Gstreamer as a
framework, and so no codec developers will support it natively. So, in
the end, they have to make all the plugins themselves now (i guess BBB
is doing that) and also maintain them.
is there some documentation available on gstreamer's plugin api? sounds
very interesting to me, but what i found on their website was pretty
incomplete and had lots of broken links in it. gstreamer is definitely
worth looking into, though.

i have also added mplayer's development list to the recipients of this
mail, as i think their mplayer g2 effort makes this a good time to think
about common standards as well.
Post by ChristianHJW
I was talking with Erik 'omega' Waltinsen a couple of times on IRC, he's
the founder of Gstreamer and does certainly know a lot about this kind
of stuff, and he told me he had a neat concept in the very back of his
brains already, but just no time to bring it onto paper. Maybe if you
guys showed some real interest in cooperating with them here, you could
start an avalanche ....
let's see what he says :) one problem with gstreamer imho is their many
g'isms, not sure what the current state there is. i think a common api
should not be dependent on stuff like glib or gobject (though i love
glib, i don't think it's a good idea to force everyone into using it).

so, let's hear what people have to say.

this is the time for constructive proposals, all flames and all mplayer
vs xine vs gstreamer vs whatever ranting will go directly into
/dev/null.

guenter
--
"The other day I put instant coffee in my microwave oven ... I almost
went back in time."
-- Steven Wright
Ronald Bultje
2003-06-27 11:31:17 UTC
Howdy,
Post by Guenter Bartsch
so, basically i think it would be interesting to see if it is possible
to agree on a common standard for free multimedia plugins, especially
for
- a common input abstraction layer
- demuxers
- codecs
and maybe also
- audio/video output
among some media player projects.
I'd be in favour of some of these. There are some buts here...

* if we take ffmpeg's demuxer (application) interface as an example, we
see that it is severely limited. I don't see any way to specify subtitle
streams, nor can I get private streams. I'm limited to video/audio,
which isn't a good thing. Of course this is fixable, but in the best
case, it'd require some sort of objectification, and as far as I
understand, you guys aren't really in favour of this. There are three
ways to do this:
1) c++
2) g_* stuffies
3) c struct with casts
Basically, (2) is (3) with some niceties of separation of classes and
instances around it. Anyway, most people will dislike both (1) and (2)
simply because they have dependencies (c++/glib). (3) looks evil and is
quite some work to implement, but would be worth a try. What you
probably want is - taking the demuxer as an example - to have a
bytestream object which exports codecstreams (parent object) which can
be a video/audio/subtitle/teletext/private/any stream (a child of the
codecstream parent).

Fortunately, for codecs, the idea is much simpler, but the same problem
applies to properties: each codec must be able to generate *any*
property with *any* value type. This is my main problem with ffmpeg
currently (I do want to mention that ffmpeg totally rocks, but just like
anything, it isn't perfect ;) ): it's just one static struct. This is
nice'n'simple, but also severely limiting. This is why ffmpeg's
current form of identifying codecs/streams/etc. wouldn't be a good
basis for such a generic codec API, imo.

GStreamer isn't perfect either. It depends on glib - I can understand
that people won't like that. Same goes for other interfaces. It's
basically fairly complex for a thing like a codec interface. It's fairly
generalized towards basically anything, which makes it less suited as
example for specific purposes. Basically, if I want a codec API or a
demuxer API, I probably want these to be specifically suited for that
purpose. For GStreamer, both APIs would be the same. ;). What I'm saying
is that if we want to make a good codec API, some ideas of GStreamer
(extensibility etc.) might be worth considering, but in general,
GStreamer's API in itself might be a bit too general. I'm not sure what
others think of this. The good thing is that you're not constrained by a
limited set of properties, streams, types of streams or anything;
everything is extensible without having to modify any of the GStreamer
core.

As for Xine/Mplayer, I'm sorry to say that I don't have enough
experience with their code to give a strong opinion on it. I did look at
the code a few times, but don't know the codebase well enough. I'll try
to make some general comments, though, regarding the codec API of both.

I've had a quick look at the mplayer demuxer code (simply because I took
that as an example for the ffmpeg one too), and noticed that it has
entries for audio, video and subtitles in the demuxer_t struct. That
leads to the same complaint as for ffmpeg - it's pretty good, but it is
somewhat limiting. Another problem (well, take these as comments, not
complaints) is that the actual filestream data (which should be private
in the demuxer, not adaptable by the application) is integrated in the
demuxer_t struct, too. This isn't a problem, but it's not a good idea to
make this part of the codec/demuxer API - it should only contain things
that the application actually cares about. The rest should be kept
private.

As for Xine, I read through it quickly but apparently, I don't get it.
;). xine-lib/src/demuxers/demux.h doesn't mention anything apart from
video. I guess I'm missing something. ;).
Post by Guenter Bartsch
is there some documentation available on gstreamer's plugin api? sounds
very interesting to me, but what i found on their website was pretty
incomplete and hat lots of broken links in it. gstreamer is definitely
worth looking into, though.
Well, there used to be some, but I can't find it anywhere, which is
pretty bad. In our CVS, there's a gst-template module. In the
directory gst-plugin/src/*, you'll find an example plugin. That is
probably a good start. Direct WWW link:
http://cvs.sourceforge.net/cgi-bin/viewcvs.cgi/gstreamer/gst-template/gst-plugin/src/

At the bottom, plugin_desc is the structure that loads the plugin. The
_class_init(), _init() and _get_type() functions are all gobject core
stuff. _get_property() and _set_property() (for properties) are all
glib/gobject property functions. _chain() is the actual datastream
callback for I/O. At the top, the pad templates define I/O points for
this plugin, and a caps (none given here) defines the type of the
stream. A GstCaps is basically what we use as way of identifying data
types. Some more info on this is in CVS, module gstreamer:
docs/random/mimetypes. www link:
http://cvs.sourceforge.net/cgi-bin/viewcvs.cgi/gstreamer/gstreamer/docs/random/mimetypes
We're currently basically done documenting all this, but it's not all
implemented perfectly yet.
Post by Guenter Bartsch
let's see, what he says :) one problem with gstreamer imho is their many
g'isms, not sure what the current state there is. i think a common api
should not be dependant on stuff like glib or gobject (though i love
glib i don't think it's a good idea to force everyone into using it).
As said above, I tend to agree that it's not a good dependency for a
core thing such as a codec API.

What I'd love to see is the codec API not being an actual lib, but just
a protocol, like X. Each application can then write its own
implementation code for this, and the codec API can, if wanted, provide
a header file describing all structs (if any) etc. used in the protocol.
Tying ourselves to one and the same lib likely won't work, even now
we're already reproducing so much of the same code... Do people agree
with this?

Ok, now back to the actual question, what would it look like. A protocol
describes structs or data fields, documents what properties mean what,
etc. If we want a common demuxer API, I'd very much prefer if people
would use the parent-child object stuffing I described in the beginning
of this email. Each bytestream can export codec streams, which can be
anything, including but not limited to audio, video, subtitle, teletext,
private data or whatever. Types define the actual type of the codec
stream, and by means of a child class, all specifics are filled in for
the particular stream.
For audio, this is rate, channels, maybe (in case of raw audio) the
bitsize (8, 16) or container type (uint8_t, int16_t, maybe expansions of
these for future reference), this can also be float data, btw! We just
need to document what property name means what, and then simply make a
name/value pair system that describes all these.
For video, this would be width, height, ... For subtitles, this would be
nothing at first, I guess.
The parent object describes things like timestamps, duration, the actual
data+size.

Codecs: much easier. One input, one output. Properties are the same as
the properties for the codec stream in the demuxer in case of the
encoded data. For the decoded data, the same applies, but we'll probably
want to provide some more properties for 'raw' data. Anyway, this just
needs specifying. The mimetypes document above is what we currently use,
other comments are welcome, of course.

Concerns from my side, for as far as my experience goes with codecs in
GStreamer: do we want to separate between codec and bytestream in cases
where these are (almost) the same, such as ogg/vorbis, mp3, etc? If so,
what are the exact tasks of the parser (e.g. defining properties
required by the codec, maybe metadata) and the codec (decoding), and
what do we do if these interfere (e.g. I'm being told that not ogg, but
vorbis contains the metadata of a stream!).

And just to state clearly: our final goal is to propose a standardized
API or interface of how codecs, muxer/demuxer libraries etc. should look
to be usable by our applications. It is not to define how a bytestream
should look. ;). Just so I (and you) know what we're actually talking
about.

Enough typing for now, I'd better get back to actual work here. :-o.
Comments are very much appreciated. ;).

Thanks for reading,

Ronald
--
Ronald Bultje <***@ronald.bitfreak.net>
Christian Fredrik Kalager Schaller
2003-06-27 12:37:23 UTC
Post by Ronald Bultje
There are three ways
1) c++
2) g_* stuffies
3) c struct with casts
Basically, (2) is (3) with some niceties of separation of classes and
instances around it. Anyway, most people will dislike both (1) and (2)
simply because they have dependencies (c++/glib).
Considering how widely available and widely used glib is, I have a hard
time seeing that it would be something 'most' people would dislike. I
mean, what's next, people objecting to a libc dependency?

Christian
--
Christian Fredrik Kalager Schaller <***@linuxrising.org>
ChristianHJW
2003-06-28 12:37:15 UTC
Post by Ronald Bultje
Howdy,
Post by Guenter Bartsch
so, basically i think it would be interesting to see if it is possible
to agree on a common standard for free multimedia plugins, especially
for
I'd have high hopes for this also, but i am not convinced that it is
possible to find a common denominator easily.
Post by Ronald Bultje
Post by Guenter Bartsch
let's see, what he says :) one problem with gstreamer imho is their many
g'isms, not sure what the current state there is. i think a common api
should not be dependant on stuff like glib or gobject (though i love
glib i don't think it's a good idea to force everyone into using it).
As said above, I tend to agree that it's not a good dependency for a
core thing such as a codec API.
I fully agree here BBB. This codec API can maybe be based on a small lib,
but that's it. Alex Stewart's libuci was more an example lib of how UCI
could be used from an app, while on the codec side this wasn't required.
Post by Ronald Bultje
What I'd love to see is the codec API not being an actual lib, but just
a protocol, like X. Each application can then write it's own
implementation code for this, and the codec API can, if wanted, provide
a header file describing all structs (if any) etc. used in the protocol.
Tying ourselves to one and the same lib likely won't work, even now
we're already reproducing so much of the same code... Do people agree
with this?
LOL. Maybe OT here, but this reminds me that somebody once compared
EBML, the backbone of matroska and a kind of binary XML, to a printer
protocol for LAN printers ... :) ...
Post by Ronald Bultje
Ok, now back to the actual question, what would it look like. A protocol
describes structs or data fields, documents what properties mean what,
etc. If we want a common demuxer API, I'd very much prefer if people
would use the parent-child object stuffing I described in the beginning
of this email. Each bytestream can export codec streams, which can be
anything, including but not limited to audio, video, subtitle, teletext,
private data or whatever. Types define the actual type of the codec
stream, and by means of a child class, all specifics are filled in for
the particular stream.
For audio, this is rate, channels, maybe (in case of raw audio) the
bitsize (8, 16) or container type (uint8_t, int16_t, maybe expansions of
these for future reference), this can also be float data, btw! We just
need to document what property name means what, and then simply make a
name/value pair system that describes all these.
For video, this would be width, height, ... For subtitles, this would be
nothing at first, I guess.
The parent object describes things like timestamps, duration, the actual
data+size.
Codecs: much easier. One input, one output. Properties are the same as
the properties for the codec stream in the demuxer in case of the
encoded data. For the decoded data, the same applies, but we'll probably
want to provide some more properties for 'raw' data. Anyway, this just
needs specifying. The mimetypes document above is what we currently use,
other comments are welcome, of course.
Concerns from my side, for as far as my experience goes with codecs in
GStreamer: do we want to separate between codec and bytestream in case
where these are (almost) the same, such as ogg/vorbis, mp3, etc? If so,
what are the exact tasks of the parser (e.g. defining properties
required by the codec, maybe metadata) and the codec (decoding), and
what do we do if these interfere (e.g. I'm being told that not ogg, but
vorbis contains the metadata of a stream!).
BBB, can you invest a couple of hours and come up with a small doc
describing such a protocol, so it could be discussed on the lists that
have been involved and expressed interest in such a solution?
Post by Ronald Bultje
And just to state clearly: our final goal is to propose a standardized
API or interface of how codecs, muxer/demuxer libraries etc. should look
to be usable by our applications. It is not to define how a bytestream
should look. ;). Just so I (and you) know what we're actually talking
about.
Alex had the following plans for UCI:

UCI : codec API
UFI : filter API
UMI : muxing API, so that various containers could be used from
supporting apps

All those required a different level of complexity, with UCI being the
lowest. UCI itself could not handle streams with more than one
substream in them (like DV type 1), but this was supported by UMI then.
Very unfortunately, not even UCI was getting anywhere close to completion,
and we have had no sign of life from Alex for a couple of months now, but
maybe somebody will find the time to look at what he did and documented so
far? http://uci.sf.net . BBB, is Alex's approach similar to the protocol
approach you suggested?
Post by Ronald Bultje
Enough typing for now, I'd better get back to actual work here. :-o.
Comments are very much appreciated. ;).
Thanks for reading, Ronald
Great ideas BBB, of course i only understand half of it ;-) ..... any
other devs care to comment in more detail? Could such a 'protocol'
solution be something you would want to support?

Christian
Attila Kinali
2003-12-29 17:52:17 UTC
On Sat, 28 Jun 2003 16:35:45 +0200
Post by ChristianHJW
Post by Ronald Bultje
Post by Guenter Bartsch
so, basically i think it would be interesting to see if it is possible
to agree on a common standard for free multimedia plugins, especially
for
I 'd have high hopes on this also, but i am not convinced that it is
possible to find a common denominator easily.
Leaving all political reasons aside, i don't think that this is as
easy as you believe. I only know the code of mplayer and parts of vlc,
but so far i've seen too many differences in how things are done
for a common plugin standard to be achieved w/o turning all
players into one using the same code base and ideas.
Besides, imho it wouldn't be a good idea. The power of opensource
comes from diversity; if you restrict that by using a common plugin
format that restricts the whole player to using this api/abi everywhere
internally, then you also restrict the diversity of code.
And i also don't think that a common plugin api would help much,
as less than 10% of the code of filters and output modules is
interface code, so porting code from one interface to another
(w/o optimizing for the new interface) is quite easy.
The only place where a common api makes imho sense are the codecs
as this code is quite complex and only a few people have enough
knowledge to handle that stuff properly. But there, we already have
a quite common api for codecs: libavcodec
Post by ChristianHJW
Post by Ronald Bultje
What I'd love to see is the codec API not being an actual lib, but just
a protocol, like X. Each application can then write it's own
implementation code for this, and the codec API can, if wanted, provide
a header file describing all structs (if any) etc. used in the protocol.
Tying ourselves to one and the same lib likely won't work, even now
we're already reproducing so much of the same code... Do people agree
with this?
That's imho a quite bad idea. If you have ever written anything
time-critical using X11, then you know how much time you lose
just on protocol conversion. Nevertheless, X11 is a great protocol
and it has enabled us to do a lot of things, but it was never meant
to be fast enough for real-time applications (which movie players are,
in a certain way), and thus a few extensions were invented, like DGA
and XVideo, which allow more or less direct hardware access.
If you want to write an api for movie players, then you have to
first realize that time is a critical resource and you have to do
anything to save it. You also have to see how different applications
try to get the data from the codec to the output modules and how
they optimize this process for speed. Any api that restricts the
use of such optimizations wouldn't be used much.

[a lot of blah about protocols and api]

I leave that stuff to people who understand much more
about codecs than i do :)
But that's definitely something that could be discussed
at the video application developer meeting i proposed in a
mail i sent to a few lists earlier this week.

Just my 2c

Attila Kinali
--
egp is comparable to an ikea assembly kit for an aircraft carrier
-- reeler in +kaosu
Steve Lhomme
2003-12-29 17:52:26 UTC
Post by Attila Kinali
Post by ChristianHJW
I 'd have high hopes on this also, but i am not convinced that it is
possible to find a common denominator easily.
Leaving all political reasons aside, i don't think that this is as
easy as you believe. I only know the code of mplayer and parts of vlc,
but so far i've seen too many differences in how things are done
for a common plugin standard to be achieved w/o turning all
players into one using the same code base and ideas.
Besides, imho it wouldn't be a good idea. The power of opensource
comes from diversity; if you restrict that by using a common plugin
format that restricts the whole player to using this api/abi everywhere
Well, that's one side of opensource and its diversity. What are the main
applications where opensource works? Linux and Apache. Linux uses all
the ideas that come from UNIX and is compliant with UNIX standards so that
it can work. Apache is compliant with HTTP and HTTP only, otherwise it
would be useless... At some point, if you want something to progress and
succeed, you have to fix a range of possible and impossible things.
That's what needs to be discussed first. Then we'll see what projects
could agree on something.
Post by Attila Kinali
I leave that stuff to people who understand much more
about codecs than i do :)
But that's definitely something that could be discussed
at the video application developer meeting i proposed in a
mail i sent to a few lists earlier this week.
When and where is it? Are Matroska ppl invited?
Attila Kinali
2003-12-29 17:52:33 UTC
On Thu, 25 Dec 2003 12:13:37 +0100
Post by Steve Lhomme
Well, that's one side of opensource and its diversity. What are the main
applications where opensource works ? Linux and Apache. Linux uses all
the ideas that come from UNIX and is compliant to UNIX standards so that
it can work. Apache is compliant with HTTP and HTTP only otherwise it
would be useless... At one point if you want something to progress and
succeed you have to fix a range of possible and impossible things.
That's what needs to be discussed first. Then we'll see what projects
could agree on something.
I think that OSS development works for mplayer, xine, vlc and all
the video apps too. They are just not as big projects as linux
or apache. IMHO the only interfaces which need to be defined and
to be common are those which cannot be changed or are interfaces
to external and/or remote programs. Thus linux has to provide
a posix/bsd compliant api and apache needs an http stack.
For movie players, those fixed interfaces are the file readers
and the interfaces to the output devices. Everything in between
can be changed.
Yes, i would like some standardisation too, especially for the
codecs and video/audio output devices, but on the other hand
i know that it'll be quite hard for the different projects to
agree on one common denominator.
But, you (meaning all projects) could still surprise me :)
Post by Steve Lhomme
Post by Attila Kinali
I leave that stuff to people who understand much more
about codecs than i do :)
But that's definitly something that could be discussed
at the video application developer meeting i proposed in a
mail i send to a few lists earlier this week.
When and where is it ? Are Matroska ppl invited ?
Yes, they are invited, like everyone else who is interested
in the development of video-related programs/libraries.
See http://mplayerhq.hu/pipermail/mplayer-dev-eng/2003-December/022920.html
Yes, i know i forgot quite a few projects, but the official announcement
will follow sometime in mid-january when the website and the call for
papers are ready. And i'll send it to more projects too.


Greetings
Attila Kinali
--
egp is comparable to an ikea assembly kit for an aircraft carrier
-- reeler in +kaosu
Miguel Freitas
2003-12-29 17:52:38 UTC
Hi,
Post by Attila Kinali
On Sat, 28 Jun 2003 16:35:45 +0200
Post by ChristianHJW
Post by Ronald Bultje
Post by Guenter Bartsch
so, basically i think it would be interesting to see if it is possible
to agree on a common standard for free multimedia plugins, especially
for
I 'd have high hopes on this also, but i am not convinced that it is
possible to find a common denominator easily.
Leaving all political reasons aside, i don't think that this is as
easy as you believe. I only know the code of mplayer and parts of vlc,
but so far i've seen too many differences in how things are done
for a common plugin standard to be achieved w/o turning all
players into one using the same code base and ideas.
Besides, imho it wouldn't be a good idea. The power of opensource
comes from diversity; if you restrict that by using a common plugin
format that restricts the whole player to using this api/abi everywhere
internally, then you also restrict the diversity of code.
I agree with Attila. The problem with finding this common denominator is
that all those projects have some different paradigms, and it's unlikely
you would be able to convince another project to follow your idea (maybe
changing their codebase heavily). Some only use C, some prefer C++, some
have a monolithic architecture, some use plugins and shared libs, some
use a protocol instead of an API...

Last time this discussion emerged i felt skeptical about the success of
such external standardization. I mean, suppose the xine, mplayer and
matroska teams agree on a grand unified api. it would be very arrogant
to imagine that any other project would just follow (as if we were the
'leaders' or something). "hey
{faad|mad|libmpeg2|liba52|flac|ffmpeg|ogg|vorbis|vlc|etc|etc} developer,
please abandon your own api because we just created a new one for you!"
Post by Attila Kinali
And i also don't think that a common plugin api would help much,
as less than 10% of the code of filters and output modules is
interface code, so porting code from one interface to another
(w/o optimizing for the new interface) is quite easy.
Free software development must be about the freedom to exercise your
programming creativity. When somebody creates a plugin for something
(codec, effects, muxer, video output, whatever) he will just do it the way
he feels is right. If he wants to create a plugin for xmms or xine or
mplayer just because he likes those better, good for him. My work, as a
xine developer, is to later integrate his plugin into my own architecture.
of course, if the guy wants to make the plugin more popular he may just
write those remaining 10% of code and port it to our player.
Post by Attila Kinali
The only place where a common api makes imho sense are the codecs
as this code is quite complex and only a few people have enough
knowledge to handle that stuff properly. But there, we already have
a quite common api for codecs: libavcodec
Yep, and that is only becoming a common api because it is a de facto
standard. imho libavcodec has exceptionally good programmers and is
heading towards supporting about everything. so they don't push their api
down anybody's throat, they just code.
Post by Attila Kinali
Post by ChristianHJW
Post by Ronald Bultje
What I'd love to see is the codec API not being an actual lib, but just
a protocol, like X.
Luckily you are not the one writing codecs! ;-)

I'm not a codec writer either, so i leave this suggestion to the
experts. still, it doesn't sound like a good idea to me.

regards,

Miguel
Tuukka Toivonen
2003-12-29 18:23:03 UTC
Post by Attila Kinali
Post by Ronald Bultje
Post by Guenter Bartsch
so, basically i think it would be interesting to see if it is possible
to agree on a common standard for free multimedia plugins, especially
Leaving all politcal reasons aside, i don't think that this is as
easy as you believe.
Certainly it is not easy. Or this thread wouldn't be necessary.
Post by Attila Kinali
Beside imho it wouldn't be a good idea. The power of opensource
comes from diversity, if you restrict that by using a common plugin
If this diversity means that things just don't work, I disagree. And I feel
that this would be the result, because there are devices that produce video
in their own formats. Before any application can do anything with
these devices, it must understand the format. This can be achieved by
either doing format conversion in the kernel--which isn't a good idea nor
even allowed by the master kernel hackers--or by a plugin system that is
adhered to by all applications.

A plugin must be delivered along with a device driver, but it makes no
sense to make one for each application. That would defeat the whole purpose
of a device-independent driver API.

(Yet another method would be the vloopback device, but I don't know much
about it; wouldn't it be quite inefficient?)
Post by Attila Kinali
interface code, so porting code from one interface to another
(w/o optimizing to the new interface) is quite easy.
Maybe, if somebody does it. In many cases, nobody would do it, and there
would be cases of "this web camera is supported by MEncoder but not xawtv".
Too few devices are supported in Linux anyway.
Post by Attila Kinali
The only place where a common api makes imho sense are the codecs
as this code is quite complex and only a few people have enough
Hmm, maybe I'm ignorant, but if there's a common API for codecs, why not
for other, even simpler things too?
Post by Attila Kinali
knowledge to handle that stuff properly. But there, we already have
a quite common api for codecs: libavcodec
Yeah, and Gstreamer. Probably others too. Multiplicity is the problem.
Post by Attila Kinali
Post by Ronald Bultje
What I'd love to see is the codec API not being an actual lib, but just
a protocol, like X.
That's imho a quite bad idea. If you have ever written anything
time-critical using X11, then you know how much time you lose
just on protocol conversion. Nevertheless, X11 is a great protocol
I would understand "a protocol" as the same thing as "ABI". It's the API
that could be implemented differently, e.g. in C++ and C and Python and
Perl, but the protocol or ABI could be efficient, not necessarily
pipe-oriented. When passing around image buffers it certainly makes no
sense to copy the data through a pipe if it's possible just to pass a
pointer. This is how Gstreamer already works, even though it's
"pipeline-oriented", as far as I understand.

What might make sense is to have a standard API but _not_ an ABI, because
it would allow just recompiling filters/codecs. I'm not sure if this makes
the problem any easier, though.
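The pointer-passing argument above can be sketched in C. Everything below (struct and function names) is invented for illustration; it only shows the zero-copy idea of handing the caller a pointer into decoder-owned storage instead of copying image data through a pipe:

```c
#include <stdint.h>
#include <stddef.h>

/* Hypothetical sketch: frames travel by pointer, never through a pipe.
 * All names are invented for illustration. */

typedef struct {
    uint8_t *data;      /* pixel data, owned by the decoder */
    int width, height, stride;
} uci_frame;

typedef struct {
    uci_frame internal; /* decoder-owned storage, reused per frame */
} uci_decoder;

/* "Decode" by pointing the caller at decoder-owned memory: the caller
 * receives a pointer, no image data is copied. */
const uci_frame *uci_decode(uci_decoder *dec,
                            const uint8_t *bitstream, size_t len)
{
    (void)len;                      /* a real decoder would parse this */
    dec->internal.data   = (uint8_t *)bitstream;  /* zero-copy hand-off */
    dec->internal.width  = 2;
    dec->internal.height = 2;
    dec->internal.stride = 2;
    return &dec->internal;
}
```

An in-process ABI like this costs one function call per frame; an X-style wire protocol would add at least one serialization step and a copy.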
Artem Baguinski
2003-12-31 10:00:06 UTC
Post by Tuukka Toivonen
Post by Attila Kinali
Besides, imho it wouldn't be a good idea. The power of opensource
comes from diversity; if you restrict that by using a common plugin
reminds me of:

"once I thought that this was, not the Adamic language that a happy
mankind had spoken, all unified by a single tongue from the origin of
the world to the tower of Babel, or one of the languages that arose
after the dire event of their division, but precisely the Babelish
language of the first day after the divine chastisement, the language
of primal confusion."

although i don't have much faith in universal solutions either. if you
look at my ~/.signature you'll notice i try to speak a bit Babelian
myself ;-)

there was a meeting a couple of months ago in Bergen, Norway
(http://www.piksel.no/) where real-time video artists and coders
exchanged some ideas on making their software interoperable, not uniform.
but one of the conclusions was to work on a common effect plugin
architecture similar to that of LADSPA in the audio domain. the project is
located at http://savannah.nongnu.org/projects/piksel/ - common effect
plugins are only part of the piksel framework, there is also some [less
relevant for this discussion] work going on.

i'm afraid i'm a bit too late, but there's also the piksel mailing list
(http://plot.bek.no/mailman/listinfo/piksel) concerned with open source
tools for real-time video processing and, among other issues, the problems
of their interoperability and avoiding double work. i thought [too late?]
the discussion of a common codec architecture could happen there.
Post by Tuukka Toivonen
If this diversity means that things just don't work, I disagree. And I feel
that this would be the result, because there are devices that produce video
in their own formats. Before any application can do anything with
these devices, they must understand the format. This can be achieved by
either doing format conversion in the kernel--which isn't a good idea nor
even allowed by the master kernel hackers--or by a plugin system that is
adhered to by all applications.
A plugin must be delivered along with a device driver, but it makes no
sense to make one for each application. That would defeat the whole purpose
of the device-independent driver API.
(Yet another method would be vloopback device, but I don't know much about
it, wouldn't it be quite inefficient?)
vloopback is a Linux-only driver that allows software to act like a
video4linux device. among other things this allows the construction of
video processing pipelines. it is:

- not maintained anymore
- not portable
- loads the kernel with things it shouldn't be doing

its purposes were:

- to make it possible to feed any video into programs that only expect
v4l input
- to add some pre/post processing to v4l devices outside of their
drivers [so that any "in software" processing happens outside the
driver and can be used with virtually any device].

success stories of vloopback use:
- exposing a firewire camera as a video4linux device to xawtv [for some
bizarre reason different firewire cameras provide two different APIs,
neither of them video4linux; i'm not the only one "lost in Babylon", it
seems]
- sending the output of my video processing application to ffmpeg and
streaming the encoded video with ffserver

not sure how vloopback is relevant to the discussion [my explanation aims
to show its irrelevance ;)]
Post by Tuukka Toivonen
Post by Attila Kinali
interface code, so porting code from one interface to another
(w/o optimizing to the new interface) is quite easy.
Maybe, if somebody does it. In many cases nobody would, and then there
would be cases like "this web camera is supported by MEncoder but not by
xawtv". Too few devices are supported in Linux anyway.
yep, and

- this camera is firewire dv camera, it has this interface
- that camera is firewire dc camera, it has that interface
- yet another camera is usb webcam, it has yet another interface

but this is more a problem of the kernel, i think. either the video4linux
API wasn't good enough for firewire, or the people who work on firewire
drivers were too creative...

maybe the solution, though, is another layer: some video access library
that would provide uniform access to various video inputs, with
OS-dependent backends. e.g. an OpenML (http://www.khronos.org/openml/) API
implementation that wraps existing APIs... analogous to Mesa as it was in
the beginning [emulating OpenGL in software, allowing rendering on
hardware without OpenGL capability, or on capable hardware whose drivers
provide no OpenGL support].

again, a nice issue to discuss but i don't see how it relates to the
Subject...
Post by Tuukka Toivonen
Post by Attila Kinali
The only place where a common api makes imho sense are the codecs
as this code is quite complex and only a few people have enough
Hmm, maybe I'm ignorant, but if there's a common API for codecs, why not
for other, even simpler things too?
i can name three video applications with very different requirements for
"other simpler things":

- non-linear editing
- "normal" video playback
- realtime video manipulation

it's quite difficult to find a common denominator for all three,
especially a universal one which would allow simple Lego-like front-end
coding.

on the other hand, projects like mplayer and xine have started with only
one goal in mind: to provide an open source media player that will play
all video sources, play them well and not crash. that's a noble goal and
the projects are quite successful in moving towards it.

the projects have developed infrastructure and internal plugin APIs
optimized for "normal" video playback, perfect audio/video
synchronization, onscreen display, subtitle support... any attempt to
find the least common denominator, invent a common API and retrofit the
existing successful players to it would require a lot of work, and would
make xine not xine anymore and mplayer not mplayer. and all they would
have in common would be a new API that is perfect for video playback and
useless [or hard to use] for other applications.

is such a result worth the hassle?
Post by Tuukka Toivonen
Post by Attila Kinali
knowledge to handle that stuff properly. But there, we already have
a quite common api for codecs: libavcodec
and why does every project include its own "personal" copy of the API's
implementation, then?
Post by Tuukka Toivonen
Yeah, and Gstreamer. Probably others too. Multiplicity is the problem.
Gstreamer isn't a codec API, it's a glue layer implemented in C.
Post by Tuukka Toivonen
Post by Attila Kinali
Post by Ronald Bultje
What I'd love to see is the codec API not being an actual lib, but just
a protocol, like X.
That's imho a quite bad idea. If you have ever written anything
time-critical using X11, then you know how much time you lose
just on protocol conversion. Nevertheless, X11 is a great protocol
are you sure "codec" is the correct word? i think the uniform media
access layer is missing: in xine terms, the input, demux and codec parts.

i really feel the need for such a layer for new open source media
applications, not for trying to recreate existing ones.

i guess that before trying to invent it, people should look at OpenML, as
i suggested: not for the implementation but for the protocol.
--
gr{oe|ee}t{en|ings}
artm
Cyrius
2003-06-28 22:37:02 UTC
Subject: Re: [UCI-Devel] Re: Common Opensource codec
API
Date: Sat, 28 Jun 2003 13:51:23 -0500
----- Original Message -----
Date: 27 Jun 2003 15:13:14 +0200
Subject: [UCI-Devel] Re: Common Opensource codec API
What I'd love to see is the codec API not being an actual lib, but just
a protocol, like X. Each application can then write its own
implementation code for this, and the codec API can, if wanted, provide
a header file describing all structs (if any) etc. used in the protocol.
Tying ourselves to one and the same lib likely won't work; even now
we're already reproducing so much of the same code... Do people agree
with this?
I was working with this kind of design before, but nobody seemed
interested in doing it. Basically, the language of the API layer is
immaterial; what matters is the messages and data getting passed through
it (my system used a message/struct two-parameter function for
everything). If you set up all the data as XML-like structs, with
messages that tell you what you're expecting to see, I think it can be
done language- and platform-independent at the same time. Yes, there will
be some bloat from the ID overhead, but as long as you're passing frames
and not, say, individual pixels, it should be ok. There's no reason you
can't pass video frames, config info, etc. basically as XML documents
through an API designed to handle them.
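The "message/struct 2-parameter function" idea can be sketched roughly like this in C; the message names and payload structs below are hypothetical, not taken from any real codec API:

```c
#include <string.h>

/* Hypothetical messages and payload structs, invented for illustration. */
enum { MSG_GET_NAME, MSG_DECODE_FRAME };

typedef struct { const char *name; } msg_get_name;
typedef struct {
    const unsigned char *in;
    unsigned char       *out;
    int                  len;
} msg_decode_frame;

/* Every plugin exports exactly one function of this shape:
 * a message ID plus a pointer to that message's struct. */
int demo_plugin(int msg, void *data)
{
    switch (msg) {
    case MSG_GET_NAME:
        ((msg_get_name *)data)->name = "demo";
        return 0;
    case MSG_DECODE_FRAME: {
        msg_decode_frame *m = data;
        memcpy(m->out, m->in, m->len);   /* identity "codec" */
        return 0;
    }
    default:
        return -1;   /* unknown message: caller can fall back */
    }
}
```

The XML-document variant Cyrius mentions would serialize these structs instead of passing raw pointers; the single-dispatch-function shape stays the same.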


__________________________________
Do you Yahoo!?
SBC Yahoo! DSL - Now only $29.95 per month!
http://sbc.yahoo.com
Cyrius
2003-06-28 22:39:22 UTC
Common Opensource codec API
Date: Sat, 28 Jun 2003 13:56:33 -0500
----- Original Message -----
Date: Sat, 28 Jun 2003 16:35:45 +0200
Subject: [UCI-Devel] Re: [matroska-devel] Re: Common
Opensource codec API
BBB, can you invest a couple of hours and come up with a small doc
describing such a protocol, so it could be discussed on the lists that
have been involved and were expressing interest in such a solution?
I can do this if you wanna hear about my design. I still have most of it
around.
Post by ChristianHJW
Post by Ronald Bultje
And just to state clearly: our final goal is to propose a standardized
API or interface of how codecs, muxer/demuxer libraries etc. should look
to be usable by our applications. It is not to define how a bytestream
should look. ;). Just so I (and you) know what we're actually talking
about.
UCI : codec API
UFI : filter API
UMI : muxing API, so that various containers could be used from
supporting apps
See, I think the different operations like filters and muxing should
just be subsets of the message space, because a filter is going to have
some redundant use with a codec, such as getting/receiving frames,
colorspace conversion, etc. It makes sense to just have them all share
the same functions, and just restrict which messages/structs are valid to
send to a given object based on whether it's a filter or a codec or
whatever. General-type objects could accept any message, etc.
One of the reasons I was advocating a system with 2-way communication at
each level was to do things like have an app request an operation that a
filter does not support, and have the filter propose an alternate
operation by calling back other parts of the system for more info. Thus
programmers would be free to do what they think is best for their code to
serve each message request.
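One hedged way to sketch "restrict which messages/structs are valid to send to a given object" is a per-object mask over a shared message space; all names below are invented for illustration:

```c
/* Invented message classes shared by codecs, filters and muxers. */
enum {
    MSG_CLASS_FRAME  = 1 << 0,  /* push/pull frames        */
    MSG_CLASS_MUX    = 1 << 1,  /* container-level calls   */
    MSG_CLASS_CONFIG = 1 << 2,  /* capability negotiation  */
};

typedef struct {
    unsigned accepts;                      /* classes this object takes */
    int    (*handle)(int msg, void *data); /* shared dispatch function  */
} media_object;

/* Trivial handler for demonstration: accepts anything it is given. */
int null_handle(int msg, void *data)
{
    (void)msg; (void)data;
    return 0;
}

/* Deliver a message only if the target accepts its class; a rejection
 * is the hook where the 2-way "propose an alternative" step would go. */
int deliver(media_object *obj, int msg_class, int msg, void *data)
{
    if (!(obj->accepts & msg_class))
        return -1;
    return obj->handle(msg, data);
}
```

A filter would carry `MSG_CLASS_FRAME | MSG_CLASS_CONFIG` but not `MSG_CLASS_MUX`; a general-type object would set all bits.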


Ronald Bultje
2003-06-30 05:23:13 UTC
Hey,
Post by Cyrius
See, I think the different operations like filters and muxing should
just be subsets of the message space.
This was my idea exactly. The muxing/codec operation subsets themselves
should be described in the protocol, too, but shouldn't introduce any
new features or methods, they should just implement a specific way of
using the generalized methods for their specific task.
Post by Cyrius
One of the reasons I was advocating a system with 2-way communication at
each level was to do things like have an app request an operation that a
filter does not support, and have the filter propose an alternate
operation by calling back other parts of the system for more info. Thus
programmers would be free to do what they think is best for their code to
serve each message request.
Let's first keep it simple. ;).

Ronald
--
Ronald Bultje <***@ronald.bitfreak.net>
Pamel
2003-06-29 07:41:05 UTC
See, I think the different operations like filters and muxing should
just be subsets of the message space.

I have always been in complete agreement with this. Most data types that
are passed to/from a codec/filter/muxer are identical. It always seemed
kind of useless to develop three separate designs that do 98% of the same
work. They all need things like:

Timecode
Duration
Frame
Colorspace
Video Size
Status (Error, Ready, etc)

Maybe things like the CodecPrivate would only ever be passed to/from the
codec and the muxer, but never to the filter. There is enough data that
would be duplicated across all of the interfaces that separating the APIs
would be silly. More code and more specs for nothing.
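A minimal sketch of the shared data block listed above, assuming invented field names; the point is that one struct could travel between codec, filter and muxer alike:

```c
#include <stdint.h>
#include <stddef.h>

/* One invented packet struct shared by codec, filter and muxer. */
typedef enum { ST_READY, ST_ERROR, ST_EOF } uci_status;

typedef struct {
    int64_t    timecode_ns;   /* presentation time          */
    int64_t    duration_ns;
    uint8_t   *frame;         /* raw or coded frame data    */
    size_t     frame_size;
    uint32_t   colorspace;    /* fourcc-style tag           */
    int        width, height; /* video size                 */
    uci_status status;        /* Error, Ready, etc.         */
} uci_packet;
```

Muxer-only extras like CodecPrivate could then be optional extensions to this one struct rather than a reason for three separate APIs.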


Pamel
Benjamin Otte
2003-06-29 18:10:03 UTC
Post by Cyrius
I was working with this kind of design before, but nobody seemed
interested in doing it. Basically, the language of the API layer is
immaterial; what matters is the messages and data getting passed through
it (my system used a message/struct two-parameter function for
everything). If you set up all the data as XML-like structs, with
messages that tell you what you're expecting to see, I think it can be
done language- and platform-independent at the same time. Yes, there will
be some bloat from the ID overhead, but as long as you're passing frames
and not, say, individual pixels, it should be ok. There's no reason you
can't pass video frames, config info, etc. basically as XML documents
through an API designed to handle them.
So we basically wrap this in CORBA then? ;)

Seriously: We need a simple little set of functions that a plugin needs to
implement. If it is not dead simple, nobody will implement it.
That was the important part: If it is not dead simple, nobody will
implement it. And that goes for apps _and_ plugins.
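A "dead simple" plugin contract might look like the sketch below: a few entry points and nothing else. All names are hypothetical, shown with a trivial pass-through plugin:

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical four-function plugin contract; names invented. */
typedef struct {
    const char *name;
    void *(*open)(void);
    int   (*process)(void *ctx, const unsigned char *in, int in_len,
                     unsigned char *out, int out_max);
    void  (*close)(void *ctx);
} uci_plugin;

/* A trivial pass-through plugin implementing the contract. */
static void *pt_open(void) { return malloc(1); }

static int pt_process(void *ctx, const unsigned char *in, int in_len,
                      unsigned char *out, int out_max)
{
    (void)ctx;
    if (in_len > out_max)
        return -1;               /* output buffer too small */
    memcpy(out, in, in_len);
    return in_len;               /* bytes produced */
}

static void pt_close(void *ctx) { free(ctx); }

const uci_plugin passthrough = { "passthrough", pt_open, pt_process, pt_close };
```

An application only ever touches the struct, so adding a plugin means filling in four pointers; that is about as low as the implementation barrier can go.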

Benjamin
Guenter Bartsch
2003-06-29 19:38:10 UTC
hallo benjamin,
Post by Benjamin Otte
Post by Cyrius
I was working with this kind of design before, but nobody seemed
interested in doing it. Basically, the language of the API layer is
immaterial; what matters is the messages and data getting passed through
it (my system used a message/struct two-parameter function for
everything). If you set up all the data as XML-like structs, with
messages that tell you what you're expecting to see, I think it can be
done language- and platform-independent at the same time. Yes, there will
be some bloat from the ID overhead, but as long as you're passing frames
and not, say, individual pixels, it should be ok. There's no reason you
can't pass video frames, config info, etc. basically as XML documents
through an API designed to handle them.
So we basically wrap this in CORBA then? ;)
*lol* ... yeah, overengineering seems to be quite common in those
"do-the-right-thing-fixed-forever-joined-effort" style approaches :>
Post by Benjamin Otte
Seriously: We need a simple little set of functions that a plugin needs to
implement. If it is not dead simple, nobody will implement it.
That was the important part: If it is not dead simple, nobody will
implement it. And that goes for apps _and_ plugins.
my point exactly. this is just about defining simple, easy-to-use APIs
for various multimedia plugins/modules. i too think we should just
define a basic set of functions which each plugin type should support,
not more. the API should be extensible, though: individual
implementations should be able to add fields and functions, and it
should be possible to add (probably optional) functions to the API in
the future

over the weekend i have looked through mplayer g2's and xine's
stream/input and demux module APIs and found them to be quite similar -
it should definitely be possible to define a common interface here. i'm
planning to set up a little website documenting the two approaches, maybe
i'll also look at other media players (not sure how many approaches i'll
be able to keep in my mind simultaneously ;) ). hope this will be a good
starting point for a common API

guenter
--
"Voraussagen sind ausserordentlich schwierig,
vor allem solche ueber die Zukunft." (N. Bohr)
Leif Johnson
2003-06-30 14:59:13 UTC
You all might find some inspiration in LADSPA, a pretty nice audio plugin
API hammered out on the linux-audio-dev mailing list : http://ladspa.org/.
It's written for audio plugins only, but I thought some of the concepts
might be helpful for more generic types of plugins.
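For readers unfamiliar with LADSPA, the general shape (simplified and renamed here; these are not the real ladspa.h declarations) is a single discovery entry point returning pure-data descriptors that carry function pointers:

```c
/* Simplified, renamed sketch of the LADSPA shape (NOT the real
 * ladspa.h): one discovery entry point, pure-data descriptors. */

typedef struct {
    const char  *label;        /* unique short name            */
    unsigned     port_count;   /* audio/control ports          */
    void      *(*instantiate)(unsigned long sample_rate);
    void       (*run)(void *instance, unsigned long sample_count);
    void       (*cleanup)(void *instance);
} demo_descriptor;

/* A do-nothing "gain" plugin for demonstration. */
static void *gain_instantiate(unsigned long sr) { (void)sr; return (void *)1; /* dummy handle */ }
static void  gain_run(void *inst, unsigned long n) { (void)inst; (void)n; }
static void  gain_cleanup(void *inst) { (void)inst; }

static const demo_descriptor gain = {
    "gain", 2, gain_instantiate, gain_run, gain_cleanup
};

/* Hosts enumerate by index until they get a null pointer,
 * like LADSPA's ladspa_descriptor() entry point. */
const demo_descriptor *demo_descriptor_at(unsigned long index)
{
    return index == 0 ? &gain : 0;
}
```

The appeal for this thread is that the host never links against plugin symbols directly: it enumerates descriptors and talks only through the function pointers they expose.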

leif
Post by Guenter Bartsch
hallo benjamin,
Post by Benjamin Otte
Post by Cyrius
I was working with this kind of design before, but nobody seemed
interested in doing it. Basically, the language of the API layer is
immaterial; what matters is the messages and data getting passed through
it (my system used a message/struct two-parameter function for
everything). If you set up all the data as XML-like structs, with
messages that tell you what you're expecting to see, I think it can be
done language- and platform-independent at the same time. Yes, there will
be some bloat from the ID overhead, but as long as you're passing frames
and not, say, individual pixels, it should be ok. There's no reason you
can't pass video frames, config info, etc. basically as XML documents
through an API designed to handle them.
So we basically wrap this in CORBA then? ;)
*lol* ... yeah, overengineering seems to be quite common in those
"do-the-right-thing-fixed-forever-joined-effort" style approaches :>
Post by Benjamin Otte
Seriously: We need a simple little set of functions that a plugin needs to
implement. If it is not dead simple, nobody will implement it.
That was the important part: If it is not dead simple, nobody will
implement it. And that goes for apps _and_ plugins.
my point exactly. this is just about defining simple, easy-to-use APIs
for various multimedia plugins/modules. i too think we should just
define a basic set of functions which each plugin type should support,
not more. the API should be extensible, though: individual
implementations should be able to add fields and functions, and it
should be possible to add (probably optional) functions to the API in
the future
over the weekend i have looked through mplayer g2's and xine's
stream/input and demux module APIs and found them to be quite similar -
it should definitely be possible to define a common interface here. i'm
planning to set up a little website documenting the two approaches, maybe
i'll also look at other media players (not sure how many approaches i'll
be able to keep in my mind simultaneously ;) ). hope this will be a good
starting point for a common API
guenter
--
Leif Morgan Johnson : http://ambient.2y.net/leif/
Ronald Bultje
2003-12-28 16:36:12 UTC
Hi all,
see BBB's C muxer/demuxer for gstreamer.
[***@shrek gst-plugins]$ cat gst-libs/gst/riff/riff-read.[ch]
gst/avi/gstavidemux.[ch] | wc -l
2630
[***@shrek gst-plugins]$ cat gst/matroska/ebml-read.[ch]
gst/matroska/matroska-demux.[ch] | wc -l
3684

Which I intend to port to ffmpeg (but somewhat differently, for your
pleasure). My GStreamer coding style is quite different from ffmpeg's (I
use more empty lines, fewer loops, more case/switch statements, many
small functions - all aimed at less maintenance, not less code), so I'm
guessing I can bring it down to 1500-2000 lines of code if I adapt the
style a bit (comparable to the quicktime reader, for example). Should be
acceptable, no?

Anyway, I heard Xine is also developing all this, so I'm interested in
how they do it.

Ronald
--
Ronald Bultje <***@ronald.bitfreak.net>
Linux Video/Multimedia developer
ChristianHJW
2003-12-29 17:53:42 UTC
Post by Guenter Bartsch
Post by Benjamin Otte
Post by Cyrius
I was working with this kind of design before, but nobody seemed
interested in doing it. Basically, the language of the API layer is
immaterial; what matters is the messages and data getting passed through
it (my system used a message/struct two-parameter function for
everything).
So we basically wrap this in CORBA then? ;)
*lol* ... yeah, overengineering seems to be quite common in those
"do-the-right-thing-fixed-forever-joined-effort" style approaches :>
This is beyond just "lol". It's more to the point of "let's commit the
drone who wrote such madness to an asylum". At the risk of getting
flamed, may I ask if the person who wrote this was a Matroska
developer?
Toby is not a matroska developer. He is the developer of a codec called
WARP ( http://corecodec.org/projects/warp ), and he was not able to
finish the codec because he spent months fighting the crappy, limited and
inflexible VfW API.
WARP needs an interface that allows it to load a large number of frames
into a huge buffer, as WARP's main compression algorithm works in the
time domain, i.e. on the variation of a single pixel over a number of
frames.

This codec is very unusual in its basic functionality, and Toby had to
fight with a limited codec API to be able to work on it, so you might
think that the developer behind it would know what a codec API should
look like to be future-compatible.

After all, all our discussions are, again and again, basically going
about the same matter :

You guys always want to keep things extremely simple and performant, and
I have no idea why that is. Maybe you can't afford new PCs with
state-of-the-art CPUs and are running your boxes on AMD K6's or similar,
I don't know. I have a pretty old Pentium III system, only 800 MHz, with
a 32 MB 4xAGP videocard, and even with crappy, bloated Windows plus even
more bloated, even crappier matroska I was never ever close to having CPU
performance problems. So there seems to be a clear conflict of interests,
for whatever reason.

matroska people are not primarily interested in the performance of our
code, not at all. We are thinking 10 years ahead, and compared to the
decoding power necessary to play an HDTV ( 1280 x 960 ) h.264 ( AVC,
MPEG4-10 ) video, the CPU cycles you will need to parse a matroska file
or to use a CORBA-wrapped codec API will just be peanuts, so why should
we care at all .... ??
Post by Guenter Bartsch
Post by Benjamin Otte
Seriously: We need a simple little set of functions that a plugin needs to
implement. If it is not dead simple, nobody will implement it.
That was the important part: If it is not dead simple, nobody will
implement it. And that goes for apps _and_ plugins.
my point exactly. this is just about defining simple, easy-to-use APIs
for various multimedia plugins/modules. i too think we should just
define a basic set of functions which each plugin type should support,
not more. the API should be extensible, though: individual
implementations should be able to add fields and functions, and it
should be possible to add (probably optional) functions to the API in
the future
This is stupid. You can just wrap the code instead. A basic/dumb
filter will be easy to wrap, and a more complicated one will not
easily fit into a common api anyway.
Again the same conflict. In your 'boneheaded' concentration on
simplicity and performance, you prefer to have an API that can't deal
with some special filters. For us, a filter API that can't deal with
*ANY* possible filter we can think of today is just crap. If it can't
deal with today's filters, how could you expect the API to last longer
than 2 years, given today's development speed?

I can't get rid of my impression that at least *some* devs here are so
proud they finally understood how MPEG works ( i don't ;-) ) that they
are now much too focussed on the MPEG way and how things are generally
done today. You may call matroska bloated and non-performant. Well, maybe
that's the case. However, I promise you matroska will still be here in 5
years; it will look completely different than it does today, as we will
have adapted it to the needs of tomorrow, but all old files will still be
100% compatible thanks to the very flexible ( = bloated, in your opinion )
underlying EBML structure. We'll see ....
Post by Guenter Bartsch
over the weekend i have looked through mplayer g2's and xine's
stream/input and demux module APIs and found them to be quite similar -
it should definitely be possible to define a common interface here. i'm
planning to set up a little website documenting the two approaches, maybe
i'll also look at other media players (not sure how many approaches i'll
be able to keep in my mind simultaneously ;) ). hope this will be a good
starting point for a common API
There will be no common API, unless xine wants to adopt the mplayer API.
Fortunately, you are not the one deciding this. I am glad the discussion
was started again, mainly with respect to Attila's planned video
developer meeting in Switzerland next year. I sincerely hope that the
conversation about a new API will not stop, and that we can move things
forward until then.

If the mplayer people are not interested in setting up a specific
mailing list for that, we will do so soon.
Common api means the EXACT same thing as a common player. If mplayer
and xine are using the same stream layer, the same demuxer layer, the
same codec and filter layer, and the same output modules, then the
ONLY difference between the two is the 10 lines of wrapper code in
main.c to put it all together. This is beyond idiotic. MPlayer is
better than the stupid windows crap because it _doesn't_ just wrap
DirectShow with gui widgets, but instead implements everything itself,
and does so much more efficiently and more correctly.
Rich
Yes, maybe. So finally Linux players could start to improve and try to
offer all the functionality that has been standard in the Windows world
for ages. Why is that? Because the main playback function, however
crappy and bloated its implementation, is done nicely by DirectShow, and
player developers can concentrate on improving the user interface and
adding more features, like live capturing, playlists, timeshifting,
remote control, etc. ....

Christian
matroska project admin
http://www.matroska.org
Attila Kinali
2003-12-29 17:53:52 UTC
On Sun, 28 Dec 2003 10:09:44 +0100
Post by ChristianHJW
You guys always want to keep things extremely simple and performant, and
I have no idea why that is. Maybe you can't afford new PCs with
state-of-the-art CPUs and are running your boxes on AMD K6's or similar,
I don't know. I have a pretty old Pentium III system, only 800 MHz, with
a 32 MB 4xAGP videocard, and even with crappy, bloated Windows plus even
more bloated, even crappier matroska I was never ever close to having CPU
performance problems. So there seems to be a clear conflict of interests,
for whatever reason.
ROTFL

Sorry, but that's just...
You know, a program needs to run as fast as possible, no matter what
it does; otherwise we end up waiting for the computer and justifying
Wirth's law ("programs slow down faster than computers get faster").
Besides, not everyone has the money to buy a top computer like you do.
You may not know that computer components cost more (2-3 times more)
in countries like Turkey than they do here (not to mention that people
there earn 10 times less). Your comment reminds me of Marie Antoinette's
"Why don't they eat cake if they don't have bread?". It has the same
ignorant and arrogant sound.
Post by ChristianHJW
matroska people are not primarily interested in the performance of our
code, not at all. We are thinking 10 years ahead, and compared to the
decoding power necessary to play an HDTV ( 1280 x 960 ) h.264 ( AVC,
MPEG4-10 ) video, the CPU cycles you will need to parse a matroska file
or to use a CORBA-wrapped codec API will just be peanuts, so why should
we care at all .... ??
Yes, we also design for the future, but keep in mind that the time
for THz processors hasn't come yet. A good engineer always gets
the best performance out of a machine without sacrificing functionality,
but at the same time he doesn't overcomplicate things just because
they might be useful in an unknown future.
Post by ChristianHJW
I can't get rid of my impression that at least *some* devs here are so
proud they finally understood how MPEG works ( i don't ;-) ) that they
are now much too focussed on the MPEG way and how things are generally
done today. You may call matroska bloated and non-performant. Well, maybe
that's the case. However, I promise you matroska will still be here in 5
years; it will look completely different than it does today, as we will
have adapted it to the needs of tomorrow, but all old files will still be
100% compatible thanks to the very flexible ( = bloated, in your opinion )
underlying EBML structure. We'll see ....
Ok, just have a look at the code (i hope you can read code):
demux_avi.c is around 850 lines of code and contains everything
needed for avi demuxing.
demux_mkv.c is over 3100 lines of code, which are mostly hard to
understand (IMHO), and is only a wrapper for the matroska libs, which
themselves contain another roughly 10k lines of code. I know mkv's
functionality is much greater than avi's, but imho it doesn't justify 13k
lines of code (in comparison, demux_nut.c from G2 is just 500 loc and has
about the same functionality), and a 1:3 ratio of loc between the libs
and the corresponding code in the program itself is for me a sign that
something fundamental went wrong. Maybe someone (with the heart of an
engineer) should write a pure C implementation of your libs to see
whether it's the design of mkv or just the libs.
Post by ChristianHJW
Fortunately, you are not the one deciding this. I am glad the discussion
was started again, mainly with respect to Attila's planned video
developer meeting in Switzerland next year. I sincerely hope that the
conversation about a new API will not stop, and that we can move things
forward until then.
Just go on...
Post by ChristianHJW
Common api means the EXACT same thing as a common player. If mplayer
and xine are using the same stream layer, the same demuxer layer, the
same codec and filter layer, and the same output modules, then the
ONLY difference between the two is the 10 lines of wrapper code in
main.c to put it all together. This is beyond idiotic. MPlayer is
better than the stupid windows crap because it _doesn't_ just wrap
DirectShow with gui widgets, but instead implements everything itself,
and does so much more efficiently and more correctly.
Rich
Yes, maybe. So finally Linux players could start to improve and try to
offer all the functionality that has been standard in the Windows world
for ages. Why is that? Because the main playback function, however
crappy and bloated its implementation, is done nicely by DirectShow, and
player developers can concentrate on improving the user interface and
adding more features, like live capturing, playlists, timeshifting,
remote control, etc. ....
Then why do more and more Windows users use mplayer even though there
is no nice gui available on Windows?


Attila Kinali
--
egp ist vergleichbar mit einem ikea bausatz fuer flugzeugtraeger
-- reeler in +kaosu
Steve Lhomme
2003-12-29 17:54:02 UTC
I haven't read all the threads on all the MLs (yes, a central place to
discuss this would be a good idea). But let me explain briefly why I
think standardisation, and not just of the codec system, is needed.

* It's about not reinventing the wheel for every new project (we have
all made mistakes that others can now avoid).
* It's about making sure that if a project dies, all the work done on
it can easily be reused in another one.
* It's about making sure that once one project gets a new feature, all
the others can quickly benefit from it.

Instead of sterile competition with each other (all going the same way,
in parallel rather than incrementally), we'd rather compete with
commercial products (QuickTime+iTunes, Windows multimedia computers, etc.).
John Cannon
2003-12-29 17:54:14 UTC
Permalink
Post by Attila Kinali
Then why do more and more Windows users use MPlayer even though there
is no nice GUI available on Windows?
I like mplayer because of its simplicity and its support for nearly
anything I throw at it. Can we please stop bashing everything and focus
on the API topic :) I have started playing with an API myself, but I
know it's far from perfect, and surely you will all find something
wrong with the way I implemented it. It's not aimed (at least by me) to
be THE ultimate solution to this problem; I just wanted to make my own
encoder app. While designing this, I noticed that libavcodec and
libavformat are nice but still a little lacking. libavformat is getting
better IMO with the introduction of the read_frame functions. As far as
I can see, though, neither is pluggable, allowing codecs or
demuxers/muxers to be added later, which is a little limiting IMO. My
little API simply provides a miniature COM-like interface for shared
libraries to allow plugin loading. It's very small and simple IMO. Of
course it's not as simple as a plain C library, but it provides a bit
more flexibility. Right now I only have demuxer and muxer interfaces
defined, though. And I use a C++ wrapper on the application side, so I
know you will hate that ;)
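For readers unfamiliar with the pattern, a "miniature COM-like interface" for plugins usually means a struct of function pointers that each shared library fills in and returns from a single factory symbol. Here is a minimal C sketch of the idea; all names and the dummy in-process "plugin" are my own illustration, not John's actual API:

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* COM-like interface: a vtable struct every demuxer plugin fills in. */
typedef struct Demuxer Demuxer;
struct Demuxer {
    int  (*open)(Demuxer *self, const char *path);
    int  (*read_packet)(Demuxer *self, char *buf, size_t size);
    void (*release)(Demuxer *self);    /* COM-style teardown */
    void *priv;                        /* implementation-private state */
};

/* Dummy implementation standing in for code inside a plugin .so. */
static int dummy_open(Demuxer *self, const char *path) {
    size_t n = strlen(path) + 1;
    self->priv = malloc(n);
    if (!self->priv) return 0;
    memcpy(self->priv, path, n);
    return 1;
}
static int dummy_read_packet(Demuxer *self, char *buf, size_t size) {
    /* Toy behavior: every "packet" is just the stream name. */
    strncpy(buf, (const char *)self->priv, size - 1);
    buf[size - 1] = '\0';
    return (int)strlen(buf);
}
static void dummy_release(Demuxer *self) {
    free(self->priv);
    free(self);
}

/* The one factory symbol a real plugin library would export; the host
 * would resolve it at load time (e.g. via dlsym/GetProcAddress). */
Demuxer *create_demuxer(void) {
    Demuxer *d = malloc(sizeof *d);
    if (!d) return NULL;
    d->open        = dummy_open;
    d->read_packet = dummy_read_packet;
    d->release     = dummy_release;
    d->priv        = NULL;
    return d;
}
```

The host application only ever sees the `Demuxer` vtable and the factory, so new formats can be dropped in as libraries without relinking the application.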

John Cannon
Matroska Team

PS: I think the reason libmatroska is so large is that it is aimed at
easy extension and not so much at small code size. Of course it can be
smaller; see BBB's C muxer/demuxer for GStreamer. libmatroska uses a
lot of templates and other things that I get confused by, but it's all
meant to be a reference implementation, one that correctly builds and
handles matroska. Gabest and alexnoe have both coded their own
implementations, and I am quite sure they're less than 10k lines of code.
ChristianHJW
2003-12-29 17:54:55 UTC
Permalink
Hi again,

this email is the last one that I will x-post to several lists. We set
up a new ML specifically for the subject of the new media API; the
email address of the list is

media-api AT lists.matroska.org

For subscription go to http://lists.matroska.org ( mailman ), and
everybody who is really interested in this subject is invited to do so.
One of the biggest benefits of this new list is that Mr. Felker will
hopefully not subscribe to it, so I don't have to read his brainless
and insulting '... we mplayer devs are the best, all others suck... ' BS.
mmm ... a programmer not striving for simplicity ....
I fear your code.
I am unsatisfied with ogm and wanted to give the mkv thing a try,
but now I am thinking again.
(Has anyone tried quicktime .mov or mp4 ?)
I humbly suggest you (and every other programmer who has not yet
done so) read
The Practice of Programming
by Brian W. Kernighan (Author), Rob Pike (Author)
ISBN: 020161586X
Marco,
to make this clear, I am *NOT* a developer myself. I am just a helping
hand for the matroska team, and I care about the website from time to
time ( yes, I know it badly needs an update ;-) ). I know I sometimes
speak a different language than you developers, but this is simply
because I see things through the eyes of the *USER*, and I hope the
matroska devs will confirm that this is sometimes pretty helpful. After
all, it's the *USERS* who decide if a program is good or bad, not the
developers making it or the coding style that was used.
About the book you suggested, I can tell you that the people working in
the matroska team are certainly aware of the principles of good
programming. But if we have to choose between a powerful container,
feature-rich and user-friendly, with full unicode support for
international use, and a poor and simple extension to the current MPEG
container with low CPU usage, we know which we are going for.
I made a couple of tests. I loaded a normal MKV, with one XviD video
track, one audio track and 2 subtitle tracks, into VirtualdubMod, and
remuxed it into a new MKV file, leaving out one of the subtitle tracks.
This requires a complete demux/mux, as the new file will have a
different number of tracks and therefore completely new headers, etc.
Here are the results of the tests on my PIII 800 MHz with normal SD-RAM:
Muxing speed : 850 fps
CPU load : ca. 30% ( for both demux and mux with libmatroska in both
cases, plus editing in VirtualdubMod )
In short, I think it's fair to say that under normal circumstances the
playback of this file at 25 fps will use less than 1% CPU in any case,
and that is if 'bloated' libmatroska is used for it. In reality there
are now at least 4 faster playback libs, and 3 of them are in C ( soon
in FFMPEG ). So, if anybody tells you that matroska is bloated and
playback uses too much CPU power, tell him to shut up. For your
information, since last week there is an alpha of PocketMVP with
matroska support, using BBB's C library. So, matroska playback on
PocketPCs should soon become reality. Any more doubts with respect to
'performance' ???
ROTFL
Sorry, but that's just...
You may not know that computer components cost more (2-3 times more)
in countries like Turkey than they do here (not to mention that people
there earn 10 times less). Your comment reminds me of Marie
Antoinette's "Why don't they eat cake if they don't have bread?". It
has the same ignorant and arrogant sound.
Sorry, I didn't mean to sound arrogant, not at all. Just as a quick
test, I installed the latest matroska packs on the laptop of my son (
PIII 650, 128 MB RAM, 12 GB HDD, 16 MB video ) that I had bought him
for his learning programs for his 5th birthday ( it cost me 150,- €,
used and in bad shape, from my company ). I could easily play a MKV
file with RV9 ( RealVideo 9 ) at 708 x 432 ( anamorphic ) and HE-AAC.
CPU load was constantly above 80%, true, but the file played fine. As
stated above, even PocketPCs can now play MKV files, admittedly using a
nice and slim C lib and not libmatroska, so what more do you guys want?

Anyhow, in my original email I wanted to make clear that a common API
might help to improve the development of opensource projects a lot, and
this means not only players, but also codecs and editors. The possible
performance cost should be negligible, and well worth the advantages
we get.

Again, this was my personal opinion, and everybody sharing it is
happily invited to continue the discussion on the new mailing list
mentioned at the top of this email.

Best regards

Christian
matroska project admin
http://www.matroska.org
Toby Hudon
2003-06-30 16:53:04 UTC
Permalink
----- Original Message -----
From: Ronald Bultje <***@ronald.bitfreak.net>
Date: 30 Jun 2003 09:23:19 +0200
To: Cyrius <***@yahoo.com>
Subject: Re: [gst-devel] Fwd: Re: [UCI-Devel] Re: [matroska-devel] Re: Common Opensource codec API
Post by Ronald Bultje
Hey,
Post by Cyrius
See I think the different operations like filters
and muxing should just be subsets of the message
space.
This was my idea exactly. The muxing/codec operation subsets themselves
should be described in the protocol too, but they shouldn't introduce
any new features or methods; they should just implement a specific way
of using the generalized methods for their specific task.
Post by Cyrius
One of the reasons I was advocating a system with
2-way communication at each level was to do things
like have an app request an operation that a filter
does not support, and have the filter propose an
alternate operation by calling back other parts of
the system for more info. Thus programmers would be
free to do what they think is best for their code to
serve each message request.
Let's first keep it simple. ;).
Ronald
--
The main idea was to keep it simple. The problem is that when you make everything one-way like VFW, it becomes not so simple to do certain things. I had simplicity as the main goal in mind when thinking this through.
--
__________________________________________________________
Sign-up for your own FREE Personalized E-mail at Mail.com
http://www.mail.com/?sr=signup
Toby Hudon
2003-06-30 16:55:03 UTC
Permalink
----- Original Message -----
From: Leif Johnson <***@ambient.2y.net>
Date: Mon, 30 Jun 2003 12:53:36 -0400
To: ***@ra.informatik.uni-stuttgart.de
Subject: Re: [xine-devel] Re: [gst-devel] Re: [UCI-Devel] Re: Common Opensource codec API
Post by Leif Johnson
You all might find some inspiration in LADSPA, a pretty nice audio plugin
API hammered out on the linux-audio-dev mailing list : http://ladspa.org/.
It's written for audio plugins only, but I thought some of the concepts
might be helpful for more generic types of plugins.
leif
Sorry, I'm not a Linux user, so I probably wouldn't be able to do much with it. Can you give me a quick overview?
ChristianHJW
2003-12-29 17:53:12 UTC
Permalink
@mplayer devs :

guys, what do you think about a new mailing list on your server, where
everybody interested in the subject, from all the other involved
projects, can subscribe ? It's not really making a lot of sense IMO to
x-post over several MLs to discuss this ( interesting and important )
subject.

What about media-***@mplayerhq.hu

or the like ?

Regards

Christian
Post by Cyrius
----- Original Message -----
Date: Mon, 30 Jun 2003 12:53:36 -0400
Subject: Re: [xine-devel] Re: [gst-devel] Re: [UCI-Devel] Re: Common Opensource codec API
Post by Leif Johnson
You all might find some inspiration in LADSPA, a pretty nice audio plugin
API hammered out on the linux-audio-dev mailing list : http://ladspa.org/.
It's written for audio plugins only, but I thought some of the concepts
might be helpful for more generic types of plugins.
leif
Sorry, I'm not a Linux user, so I probably wouldn't be able to do much with it. Can you give me a quick overview?
Toby Hudon
2003-06-30 17:01:13 UTC
Permalink
----- Original Message -----
From: Benjamin Otte <***@public.uni-hamburg.de>
Date: Sun, 29 Jun 2003 22:08:32 +0200 (DFT)
To: Cyrius <***@yahoo.com>
Subject: Re: [gst-devel] Fwd: Re: [UCI-Devel] Re: Common Opensource codec API
Post by Benjamin Otte
Post by Cyrius
I was working with this kind of design before, but nobody seemed
interested in doing it. Basically the language of the API layer is
immaterial; what matters is the messages and data getting passed
through it (my system used a message/struct 2-parameter function for
everything). If you set up all the data as XML-like structs, with
messages that tell you what you're expecting to see, I think it can be
made language- and platform-independent at the same time. Yes, there
will be some bloat from the ID overhead, but as long as you're passing
frames and not, say, individual pixels, it should be ok. There's no
reason you can't pass video frames, config info, etc. basically as XML
documents through an API designed to handle them.
So we basically wrap this in CORBA then? ;)
Seriously: We need a simple little set of functions that a plugin needs to
implement. If it is not dead simple, nobody will implement it.
That was the important part: If it is not dead simple, nobody will
implement it. And that goes for apps _and_ plugins.
Benjamin
Is one function simple enough? ;)

Seriously: do(UINT msg, pDOC data); That's it. Internally you switch on the message to handle the data if needed. If you don't support a given message in the API, you fall through to the default case, which is return(0). Whoever called you has the job of handling an unexpected return(0), such as trying an alternate method; if they can't, they also return(0) and pass it up the call stack until it reaches the application, which either decides to try something else or returns a message saying not supported, or error, or whatever is appropriate.

All components (by component I mean a generic object for the API that can be a container, filter, codec, muxer, etc.) support this one function. The only other function in the entire API so far is the one that creates a component from information. I.e. the application sees it needs a codec to handle "XVID" output from the container component it just made. It performs a lookup (I'm still working out the exact details of how best to do it) and finds that it points to a DLL that handles this. It then calls an API function that creates the component associated with that DLL, i.e. Component my_decoder = CreateComponent("xvid_x86.dll");

That's it, two functions. The complicated part is in defining the messages and their data, but as long as each component is only required to support the messages it can do something with, and not handle exceptions, it should be easy for component developers.
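The scheme above can be sketched in a few lines of C. The message IDs, the example decoder, and the caller-side fallback are my own illustration of Toby's description, not a defined spec:

```c
/* Hypothetical message IDs; a real spec would have to standardize these. */
enum { MSG_GET_INFO = 1, MSG_DECODE = 2, MSG_SEEK = 3 };

/* Every component exposes the same single entry point. */
typedef int (*DoFn)(unsigned msg, void *data);

/* Example component: a decoder that handles MSG_DECODE and nothing else. */
int decoder_do(unsigned msg, void *data) {
    (void)data;  /* a real decoder would act on the payload */
    switch (msg) {
    case MSG_DECODE:
        return 1;         /* handled */
    default:
        return 0;         /* unsupported: the caller must cope */
    }
}

/* Caller-side fallback, as described above: if one component returns 0,
 * try an alternative; otherwise bubble "not supported" up the stack. */
int try_decode(DoFn primary, DoFn fallback, void *frame) {
    if (primary && primary(MSG_DECODE, frame)) return 1;
    if (fallback && fallback(MSG_DECODE, frame)) return 1;
    return 0;
}
```

Note how the switch's default case is what makes partial implementations safe: an unknown message degrades to a 0 return instead of a missing-symbol failure.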

If you have a simpler idea please let me know, and BTW what is CORBA?
Ronald Bultje
2003-06-30 19:48:03 UTC
Permalink
Hey Toby,
Post by Toby Hudon
Is one function simple enough? ;)
Seriously, do(UINT msg, pDOC data); That's it. Internal switch
on the message to handle the data if needed. If you don't support a
given message in the API, you go to the default case which is
return(0). Whoever called you has the job of handling an unexpected
return(0), such as trying an alternate method, and if they can't
they also return(0) and pass it up the call stack until it gets to
the application, which either decides to try something else or
returns a message saying not supported or error or whatever is
appropriate.
The only thing that this solves is the API linking issue. And that's not
an issue anyway if we're going to define a standard API.
Post by Toby Hudon
That's it, two functions. The complicated part is in defining
the messages and their data, but as long as each component is only
required to support the messages it can do something with, and not
handle exceptions, it should be easy for component developers.
But that's the whole issue, the messages. You're just moving the problem
over from linkage space to message space. The problem here is actually
to define how all this works. ;).

Funnily enough, you're taking XVID as an example, but xvid is horrible.
Look at the API. There's one struct per message. No extensibility. No
configurability. Perfect for a one-task API, but not at all usable as a
generic API.

GStreamer (GObject)'s way of defining all this is easier. Make
name/value pairs, and make a standard doc that describes which types
support which base properties. Since there's no limit on name/value
pairs, there's no limit on extensibility. Making this two-way allows
the application to view these properties, too.
And, even better, this method doesn't depend on GStreamer or GObject at
all. GStreamer just implements it in that specific way.
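As Ronald says, the idea doesn't depend on GObject at all; it can be sketched in plain C as a tagged name/value list that a component exposes and an application queries. The property names and values below are invented for illustration:

```c
#include <stddef.h>
#include <string.h>

/* One property: an open-ended name plus a tagged value. */
typedef struct {
    const char *name;
    enum { PROP_INT, PROP_STR } type;
    union { int i; const char *s; } v;
} Prop;

/* A component's property list can grow without changing the API:
 * new names simply appear, and older applications ignore them. */
static const Prop video_props[] = {
    { "width",  PROP_INT, { .i = 708 } },
    { "height", PROP_INT, { .i = 432 } },
    { "fourcc", PROP_STR, { .s = "RV90" } },
};

/* Two-way access: the application can enumerate the list or look up
 * a property by name; NULL means "this component doesn't have it". */
const Prop *prop_find(const Prop *list, size_t n, const char *name) {
    for (size_t i = 0; i < n; i++)
        if (strcmp(list[i].name, name) == 0)
            return &list[i];
    return NULL;
}
```

The extensibility comes from the lookup being by string name rather than by fixed struct layout: adding a property never breaks an existing caller.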

The good thing about your approach is actually its simplicity, and I
like that. However, I wouldn't vote for sticking to one function, since
that only moves the definition problem over from linkage space to
message space, and that has no advantage, imho. Still, the idea of
using as few functions as possible sounds good; I'll try sticking to
that. :)
Post by Toby Hudon
If you have a simpler idea please let me know, and BTW what is
CORBA?
Something like COM, but for Unix/Linux.

Ronald

PS Christian, you've asked me to come up with a document, I'll try to do
that in the course of this week. Give me a few days, I have a job, too.
;). I'll take some ideas from the list and mix them together with some
of my own.
--
Ronald Bultje <***@ronald.bitfreak.net>
Toby Hudon
2003-07-01 15:39:11 UTC
Permalink
----- Original Message -----
From: Ronald Bultje <***@ronald.bitfreak.net>
Date: 30 Jun 2003 23:17:44 +0200
To: Toby Hudon <***@mail.com>
Subject: Re: [gst-devel] Fwd: Re: [UCI-Devel] Re: Common Opensource codec API
Post by Ronald Bultje
But that's the whole issue, the messages. You're just moving the problem
over from linkage space to message space. The problem here is actually
to define how all this works. ;).
Well, it's got to be somewhere. There's obviously a fairly big number
of things to do: get frames, configure things, get keyframes, etc. I
figured defined message constants are easier to type without errors
than function calls and their parameters. Also, if we're going to
assume that for simplicity not everyone will have to implement every
possible function, what happens when an app calls a function that a
component doesn't support? Does the system bomb, or what? With the
message system the caller just looks at the return value to know
what's going on; there's no chance of hitting a function without a
definition.

Unless you expect people to have, say, 200 functions, most of which
are empty... that seems kind of redundant to me when a switch's
default case is a lot cleaner and has the same effect.
Post by Ronald Bultje
Funnily, you're taking XVID as an example, but xvid is horrible. Look at
the API. There's one struct per message. No extendibility. No
configurability. Perfect for a one-task API, but not at all usable for a
generic API.
Sorry, I just picked a codec name at random. I haven't looked at their API that much, but from what I've seen while trying to write my VFW code based on theirs, it's most likely a mess, as you say.
Post by Ronald Bultje
GStreamer (GObject)'s way of defining all this is easier. Make
name/value pairs, and make a standard doc that describes which types
support which base properties. Since there's no limit on name/value
pairs, there's no limit on extensibility. Making this two-way allows
the application to view these properties, too.
And, even better, this method doesn't depend on GStreamer or GObject at
all. GStreamer just implements it in that specific way.
Never seen it, or even heard of it for that matter. What do you mean by name/value pairs, just variables? How is this better or different from what I was talking about? You'd still need to define a standard doc (i.e. a bunch of constants defined for messages and their data structs/docs), and as for what supports what, you find out when you try to pass a message: if it returns 0, it's unsupported. You, as the caller, get the job of handling it if that's a problem.
Post by Ronald Bultje
The good thing about your approach is actually its simplicity, and I
like that. However, I wouldn't vote for sticking to one function, since
that only moves the definition problem over from linkage space to
message space, and that has no advantage, imho. Still, the idea of
using as few functions as possible sounds good; I'll try sticking to
that. :)
Well, I could see segregating different types of operations into different functions, such as the linking of dynamic libraries, and I've already said that would probably want to be separate. But for the actual video processing itself, why would you put it in multiple functions? Since filters, codecs, etc. all have similar needs, doesn't it make more sense for them to just pick and choose from the defined operations, instead of forcing them to implement functions they might not need? I still don't know what happens when you call a function that isn't there; can you please tell me? I haven't tried it, but I suspect it isn't pretty.
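To make the question above concrete: with plain function linkage, the host has to probe for each optional entry point before calling it, because calling through a missing symbol simply crashes. A small C sketch of that probe, using an in-process symbol table as a stand-in for dlsym() (the symbol names are illustrative, not from any real plugin):

```c
#include <stddef.h>
#include <string.h>

typedef int (*ApiFn)(void *);

static int do_decode(void *data) { (void)data; return 1; }

/* Stand-in for a plugin's exported symbols, as dlsym() would see them.
 * A function the plugin never implemented simply isn't listed. */
static const struct { const char *name; ApiFn fn; } symtab[] = {
    { "decode", do_decode },
    /* note: no "seek" entry -- this plugin doesn't implement it */
};

/* Like dlsym(): a missing symbol yields NULL rather than a crash,
 * but only if the caller remembers to check before calling. */
ApiFn lookup(const char *name) {
    for (size_t i = 0; i < sizeof symtab / sizeof symtab[0]; i++)
        if (strcmp(symtab[i].name, name) == 0)
            return symtab[i].fn;
    return NULL;
}
```

Calling through the NULL returned for "seek" is exactly the "isn't pretty" case Toby suspects, so the caller must NULL-check every optional symbol, which is the same burden as checking a return(0) in the message scheme.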
Post by Ronald Bultje
Post by Toby Hudon
If you have a simpler idea please let me know, and BTW what is
CORBA?
Something like COM, but for Unix/Linux.
What's COM?
Toby Hudon
2003-12-29 17:55:20 UTC
Permalink
----- Original Message -----
From: Attila Kinali <***@kinali.ch>
Date: Sun, 28 Dec 2003 16:28:38 +0100
To: mplayer-dev-***@mplayerhq.hu
Subject: [UCI-Devel] Re: [MPlayer-dev-eng] Re: Common Opensource codec API
Post by Attila Kinali
On Sun, 28 Dec 2003 10:09:44 +0100
Post by ChristianHJW
You guys always want to keep things extremely simple and performant,
and I have no idea why that is. Maybe you can't afford new PCs with
state-of-the-art CPUs and are running your boxes on AMD K6's or
similar, I don't know. I have a pretty old Pentium III system, only 800
MHz, with a 32 MB 4xAGP videocard, and even with crappy, bloated
Windows plus even more bloated, even crappier matroska I was never ever
close to having CPU performance problems. So, there seems to be a clear
conflict of interests, for whatever reason.
ROTFL
Sorry, but that's just...
You know, a program needs to run as fast as possible, no matter what
it does; otherwise we end up waiting for the computer and justifying
Wirth's law ("programs slow down faster than computers get faster").
Besides, not everyone has the money to buy a top computer like you do.
You may not know that computer components cost more (2-3 times more)
in countries like Turkey than they do here (not to mention that people
there earn 10 times less). Your comment reminds me of Marie
Antoinette's "Why don't they eat cake if they don't have bread?". It
has the same ignorant and arrogant sound.
No, not everything needs to run as fast as possible if that has too many other tradeoffs. If it did, then why aren't you hand-coding everything in assembly instead of wasting your time with things like C?

I can't speak for Matroska's design because I'm not all that familiar with it, but I designed my project for readability of code and scalability of the algorithm, and the entire thing, including the interface, isn't even 3100 lines if I remember right.

Moore's law makes CPU speed effectively infinite given enough time. Programmer man-hours, however, will always be finite. Designing systems that are easy to understand and extend makes them easier to go back and optimize or improve later. Speed can be important, but writing code so it can be understood, even if you later go back and replace it with a faster implementation, is often even more important.