Microsoft and 3D Graphics: A Case Study in Suppressing Innovation and Competition
by Allen Akin

Foreword

The original version of this article was written in July, 1997, and is reproduced (with minor changes to improve continuity and to update hyperlinks) in the first few sections following the Foreword. If you're familiar with the original version, feel free to skip to the August, 1998 Update; otherwise, please proceed to the Introduction.

Introduction

In the last five years, three-dimensional computer graphics has become a critical technology in the PC world. 3D graphics capability is a prerequisite for two major markets: Computer-aided industrial and mechanical design (for which the majority of workstations and high-end PCs are sold today) and entertainment (especially computer games, which are driving significant new PC hardware and software purchases by consumers).

Examining Microsoft's treatment of 3D graphics technology is especially instructive, because so many of Microsoft's actions in that area have been exposed to public scrutiny.

Technical Background

We must define two technical concepts in order to understand the issues in detail.

Application Programming Interfaces

First is the notion of an Application Programming Interface, or API. An API is essentially a library of small functions that a software developer pieces together to form a complete computer program. For example, a 3D graphics API might include functions that draw objects, simulate the effects of lights, and determine which objects are visible from a given point of view. A software developer could use these functions to build a flight simulation game: drawing mountains and buildings that would be visible through the cockpit canopy of an airplane, and lighting them as they would be lit by the sun.

The design of a 3D graphics API determines what can be drawn, how quickly it can be drawn, and how easy it is for a software developer to achieve a desired effect. Thus the API is of fundamental technical and business importance to the software developers who use it.
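
To make the idea of an API as a library of small functions concrete, here is a minimal sketch in C of two functions of the kind described above: one that simulates a simple light and one that tests visibility. Everything in it is an assumption made for illustration. The names (Vec3, api_diffuse_intensity, api_faces_viewer) are invented and belong to neither OpenGL nor Direct3D, and the lighting function assumes its inputs are unit-length vectors.

```c
#include <assert.h>

/* Hypothetical 3-component vector; not taken from any real graphics API. */
typedef struct { double x, y, z; } Vec3;

static double dot(Vec3 a, Vec3 b) {
    return a.x * b.x + a.y * b.y + a.z * b.z;
}

/* Returns nonzero when v has length 1, within a small tolerance. */
static int is_unit(Vec3 v) {
    double err = dot(v, v) - 1.0;
    return err < 1e-9 && err > -1e-9;
}

/* "Simulate the effects of lights": the diffuse (Lambertian) brightness of a
 * surface, from 0.0 (unlit) to 1.0 (facing the light directly).
 * Assumes both arguments are unit-length. */
double api_diffuse_intensity(Vec3 surface_normal, Vec3 to_light) {
    assert(is_unit(surface_normal) && is_unit(to_light));
    double d = dot(surface_normal, to_light);
    return d > 0.0 ? d : 0.0;
}

/* "Determine which objects are visible": returns 1 when a surface with the
 * given normal faces toward the viewer, 0 when it faces away. */
int api_faces_viewer(Vec3 surface_normal, Vec3 to_viewer) {
    return dot(surface_normal, to_viewer) > 0.0;
}
```

A flight simulator built on such a library would call the visibility test to skip the far side of a mountain and the lighting function to shade the near side; the developer never needs to know how either function is carried out on a particular machine.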

Device Drivers

Second is the notion of a device driver. A device driver (driver for short) is a piece of software that implements an API on a given hardware device. For example, the functions of a 3D graphics API might be implemented differently on the video card in a Compaq PC than they would be implemented on another video card in a Dell PC. In such a case, the two hardware vendors (Compaq and Dell) would each write drivers for the 3D graphics API on their respective video cards. Note that the API would be used in the same way on both machines; only its implementation details (the drivers) would differ.

A driver is key to the viability of its associated API and all the application programs that use it. The existence of a driver determines whether a given API is even available on a particular machine. The quality of the driver (its completeness, performance, and reliability) in large measure determines the quality of the software that uses it on that machine. A driver typically requires a significant amount of effort for development and testing, and is therefore of major importance to the hardware vendor who must supply it as well as the software developers and consumers who will use it.

History

The OpenGL API for 3D graphics

By 1992 it was clear that 3D graphics was poised to become a critical technology in several markets. Requests from independent software vendors led a consortium of companies to agree to support a common 3D graphics API. This new API, derived from a popular, older graphics library created by Silicon Graphics, was called OpenGL.

OpenGL was to be a state-of-the-art API that could be implemented efficiently on a wide variety of computers. Its specification would be controlled by a committee (known as the Architecture Review Board, or ARB) rather than by (and for the benefit of) a single vendor. It would be easy for any hardware vendor to extend OpenGL to accommodate innovative new features, thus allowing rapid development of new applications.
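
The relationship between an API, its drivers, and vendor extensions can be sketched in a few lines of C. This is an illustration under stated assumptions, not the actual OpenGL driver interface: the Driver structure, the two vendors, and the extension name HYPOTHETICAL_curved_patches are all invented. The point is only that the application code is written once against the API, each vendor supplies its own implementation, and a vendor can offer an extra feature that applications discover by name.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Generic function-pointer type used to hand back extension entry points. */
typedef void (*ExtProc)(void);

/* The API as an application sees it: the same entry points on every machine. */
typedef struct {
    const char *vendor;
    int (*draw_triangles)(int count);            /* returns primitives drawn */
    ExtProc (*get_extension)(const char *name);  /* NULL when unsupported */
} Driver;

/* Hypothetical vendor A: implements the core API only. */
static int a_draw_triangles(int count) { return count; }
static ExtProc a_get_extension(const char *name) { (void)name; return NULL; }

/* Hypothetical vendor B: core API plus one invented extension. */
static int b_draw_triangles(int count) { return count; }
static int b_draw_curved_patches(int count) { return count; }
static ExtProc b_get_extension(const char *name) {
    if (strcmp(name, "HYPOTHETICAL_curved_patches") == 0)
        return (ExtProc)b_draw_curved_patches;
    return NULL;
}

const Driver driver_a = { "Vendor A", a_draw_triangles, a_get_extension };
const Driver driver_b = { "Vendor B", b_draw_triangles, b_get_extension };

/* Application code: identical whichever driver is installed. It uses the
 * extension when the driver offers it and otherwise skips the patches.
 * Returns the total number of primitives drawn. */
int render_scene(const Driver *drv, int triangles, int patches) {
    assert(drv != NULL);
    int drawn = drv->draw_triangles(triangles);
    ExtProc proc = drv->get_extension("HYPOTHETICAL_curved_patches");
    if (proc != NULL) {
        int (*draw_patches)(int) = (int (*)(int))proc;
        drawn += draw_patches(patches);
    }
    return drawn;
}
```

With these definitions, render_scene(&driver_a, 10, 4) draws only the 10 triangles, while render_scene(&driver_b, 10, 4) also draws the 4 patches through the extension. No change to the API itself, and no permission from whoever controls it, was needed for vendor B to add the feature.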
The original members of the ARB were Digital Equipment Corporation, International Business Machines, Intel, Microsoft, and Silicon Graphics.

At this time Microsoft was developing the first version of its new high-end operating system, Windows NT. A large part of the computer-aided design market was inaccessible to Microsoft because of technical shortcomings in Windows 3.1, and Microsoft sought to remedy those in Windows NT. One such shortcoming was the lack of good 3D graphics support, and OpenGL offered an expedient solution.

The Problem of OpenGL Driver Support

As part of a joint engineering project, Microsoft and Silicon Graphics produced an implementation of OpenGL for Windows NT. One feature of this implementation was a new (to Microsoft) device driver design, called an Installable Client-side Driver (ICD), which offered higher graphics performance and permitted any PC hardware vendor to extend the OpenGL API to support new and innovative 3D graphics functionality. This was in contrast to the usual Microsoft driver model, which was appropriate for more modest graphics performance requirements and in which Microsoft reserved exclusive control over the design of the driver and thus of the API it supported.

The 3D graphics hardware business for PCs (particularly low-cost PCs) evolved much more slowly than Microsoft and Silicon Graphics anticipated. A few PC hardware vendors (notably Digital Equipment, Intergraph, and 3Dlabs) provided capable OpenGL drivers for machines running Windows NT, but low-cost, high-volume solutions for consumers using Windows 3.1 and Windows 95 were slow to arrive.

As 3D graphics hardware became more generally available in 1996, pressure to provide fast, stable implementations of OpenGL increased, and Microsoft developed a new OpenGL device driver design called the Mini Client Driver (MCD).
The MCD greatly reduced the time required to produce a quality OpenGL implementation for a large class of PC graphics cards, and thus had the potential to dramatically increase the availability of OpenGL.

Microsoft Creates the Direct3D API

In 1995 and 1996 Microsoft established a new program to support games on PCs running its Windows 95 operating system. The goal was to expand the market for PCs into the area then dominated by game consoles such as those from Nintendo and Sega.

Microsoft chose not to use the OpenGL technology it already provided in Windows NT to handle 3D graphics for games. Instead, Microsoft purchased Rendermorphics, Ltd. and acquired its 3D graphics API known as RealityLab. Microsoft reworked the device driver design for RealityLab and announced the result as a new 3D graphics API called Direct3D Immediate-Mode (Direct3D).

Leveraging Windows 95 to Promote Direct3D and Freeze OpenGL

Microsoft refused to release the software needed to support OpenGL-based games on Windows 95. In fact, for a considerable time Microsoft chose not to support OpenGL on Windows 95 at all, which made it impossible for users of OpenGL-based applications on Windows NT to run them on Windows 95.

Microsoft also took the unusual step of retracting its support for MCD drivers for OpenGL, even though it had already released kits to hardware developers. As a consequence, some hardware developers were forced to recall OpenGL drivers that were already in the beta-test phase.

Microsoft's actions partitioned the 3D graphics market, guaranteed that OpenGL would not be widely available on high-volume PCs targeted by Windows 95, and leveraged Windows 95 to boost the overall market penetration of Direct3D.

Microsoft marketing teams began to promote the proprietary Direct3D API to games developers, hardware developers, and the trade press, while simultaneously marginalizing OpenGL.
If Microsoft mentioned OpenGL at all, it was presented as a low-performance API that was suited only for certain professional computer-aided-design applications on Windows NT, while Direct3D was "mainstream" and offered "real-time" performance on the much more heavily-hyped Windows 95 operating system. (This despite the widespread use of OpenGL in high-performance applications with close technical similarities to games, such as flight simulators.)

Microsoft also increased its commitment of staff to Direct3D while freezing the level of staffing for OpenGL, with the result that OpenGL development slowed relative to Direct3D.

API Wars

Silicon Graphics and many other users of OpenGL have businesses that depend on the ability to offer innovative and high-performance graphics technology. As it became clear that Microsoft intended to replace OpenGL with Direct3D, that Direct3D suffered from many technical shortcomings, and that (unlike OpenGL) Direct3D could not be extended by hardware vendors because it was controlled completely by Microsoft, Silicon Graphics decided to mount a demonstration at the 1996 Special Interest Group on Computer Graphics (SIGGRAPH) conference in New Orleans. The demonstration showed conclusively that OpenGL was at least as fast as Direct3D, thus refuting Microsoft's key marketing claim.

Since OpenGL was already acknowledged (by Microsoft among others) as having more functionality than Direct3D, and potentially higher image quality than Direct3D, the demonstration precipitated an intense debate in the computer graphics and game development communities: Why was Microsoft promoting a new, less-capable API, and withholding already-existing device driver technology that could allow its customers to use the superior product?

Much of the public discussion took place in the comp.graphics.api.opengl and rec.games.programmer Usenet newsgroups, and is accessible from DejaNews. (If you choose to research this, be prepared for a great deal of reading!
Consider searching for the thread titled "DirectX vs OpenGL" in the comp.graphics.api.opengl newsgroup starting around August 1996.)

Game Developers Ask for OpenGL on Equal Footing with Direct3D

As the technical and marketing issues were exposed, a strong pro-OpenGL reaction began. John Carmack of id Software, developer of the popular game Doom, stated publicly that he would not use Direct3D and would use OpenGL instead. Chris Hecker published a comprehensive analysis of the two APIs in the April-May 1997 issue of Game Developer magazine, concluding that Microsoft should simply discontinue Direct3D and put its efforts into OpenGL.

It began to appear that Microsoft was using Direct3D to achieve market control and to limit innovation to areas that could not be used to challenge Microsoft, rather than to provide a technically superior product for its customers or to promote free competition between APIs.

Game developers issued two petitions to Microsoft. The first, from 56 top game developers, called for Microsoft to release OpenGL MCD device drivers and other work that it had completed but withheld because it would allow OpenGL to compete with Direct3D. A second open letter to Microsoft on the same subject gathered 254 signatures initially and over 1400 by the time the letter was closed; the comments offered by some signatories are particularly interesting.

Microsoft's reply was to reiterate its old market positioning statement that Direct3D was for high-volume high-performance applications, and OpenGL was for high-precision computer-aided-design applications only. Although the petitioners made it clear that they wanted the two APIs on an equal footing, so that competition would spur innovation and no single party would control access to the graphics hardware, Microsoft responded by increasing its investment in Direct3D and reducing its investment in OpenGL even further.
To this author's knowledge, Microsoft never issued a press release to acknowledge the petitions.

August, 1998 Update

About the time the original version of this article was written, Jon Peddie Associates published a cogent editorial summarizing the situation. Its assessment of Microsoft's strengths, weaknesses, and behavior is worth reading even today.

Microsoft continues to update Direct3D, and with each new revision incorporates more features from OpenGL. The overlap is not yet complete; for example, Direct3D still lacks the ability to handle curved surfaces, and it doesn't support graphics cards with geometry acceleration hardware. Nevertheless, Direct3D is much more capable than it was a year ago, and its evolution is beginning to diverge from that of OpenGL.

Fahrenheit

The most significant development over the past year is the Fahrenheit project. Silicon Graphics, dependent on Microsoft for most of the software required by its upcoming Visual PC product and concerned about technical problems with the DirectModel graphics API announced by Microsoft and Hewlett-Packard, elected to negotiate a compromise. Silicon Graphics, Microsoft, Intel, and Hewlett-Packard created a joint engineering project codenamed Fahrenheit to produce three new APIs. Two of these are beyond the scope of the current discussion, but the third (Fahrenheit Low-Level, or FLL) is usually touted as the resolution of the conflict between OpenGL and Direct3D.

What is the Fahrenheit Low-Level API?

At the moment no one knows what FLL will be, since no specification exists, but public statements are remarkably consistent in a few respects.

From Silicon Graphics' original press release:
From the August, 1998 Microsoft Developer Network DirectX chat session:
From slides presented at Microsoft's 1998 Meltdown Conference:
In other words, the "new" Fahrenheit Low-Level API is simply Direct3D, plus whatever additional features seem needed to match the functionality in OpenGL. With the competitive threat from OpenGL neutralized, Microsoft can proceed with business as usual.

Should Fahrenheit be regarded as a welcome sign of Microsoft's responsiveness to its customers and partners? After all, there will now be just one API, relieving a considerable burden on hardware and software developers and simplifying life for consumers.

I would argue the answer is no. The new APIs are entirely Microsoft-proprietary; Microsoft is now the bottleneck for all significant innovation in the computer graphics industry. Furthermore, it must be remembered that the entire conflict was of Microsoft's creation: Without Direct3D, the industry would have arrived at this point literally years ago, with greater opportunity for competition and without the constraint on innovation that Fahrenheit represents today.

Status of OpenGL

OpenGL is still the only real alternative to complete control of 3D graphics by Microsoft. It remains viable, though Silicon Graphics no longer promotes it in any manner unacceptable to Microsoft, so it is at much greater risk.

Game developers are an independent-minded lot, and several important ones are still using OpenGL. As a result, hardware vendors are working to improve their support for it. Direct3D is not yet capable of handling high-end graphics hardware and most professional applications; OpenGL dominates those niches. Finally, the Open Source community (notably the Mesa project) is scrambling to provide OpenGL support for computers of any type, whether or not they use one of Microsoft's operating systems.

Conclusions

3D graphics is a valuable case study for students of Microsoft. In the course of this campaign for control of a new market, Microsoft consistently:
In large measure, it appears that these tactics have succeeded.

Microsoft often suggests that it provides only benefit to consumers, that the standardization it enforces is well worth the cost of ceding it complete control over the computer products those consumers use for work and entertainment. In this case, it is clear that standardization for the consumer's benefit was not Microsoft's goal; a superior standard product existed, but Microsoft systematically suppressed it in order to establish its own product and gain control of a new market in which Microsoft previously had no presence. Microsoft did this, and could do this, only because it enjoyed overwhelming dominance as an operating-system supplier.

As a result, product features that might have been available to consumers up to two years ago are only now becoming common. In the meantime product development costs for both hardware and software vendors have been increased significantly by the need to deal with two competing standards. 3D graphics application software development has been inhibited by an artificially-created fragmentation of the market into incompatible "consumer" (Windows 95) and "professional" (Windows NT) segments.

Is this not precisely the kind of situation for which an antitrust action is the appropriate response?

Allen Akin is an independent software developer in Palo Alto, California.

Appendix: John Carmack's Discussion of OpenGL and Direct3D

(Included here because it may be difficult to locate a copy online.)

[idsoftware.com]
Login name: johnc                  In real life: John Carmack
Directory: /raid/nardo/johnc       Shell: /bin/csh
On since Dec 15 01:19:05           6 days 5 hours Idle Time
   on ttyp2 from idnewt
On since Dec 17 01:05:12           4 days 23 hours Idle Time
   on ttyp3 from idcarmack
Plan:

I am going to use this installment of my .plan file to get up on a soapbox about an important issue to me: 3D API. I get asked for my opinions about this often enough that it is time I just made a public statement.
So here it is, my current position as of december '96...

While the rest of Id works on Quake 2, most of my effort is now focused on developing the next generation of game technology. This new generation of technology will be used by Id and other companies all the way through the year 2000, so there are some very important long term decisions to be made.

There are two viable contenders for low level 3D programming on win32: Direct-3D Immediate Mode, the new, designed for games API, and OpenGL, the workstation graphics API originally developed by SGI. They are both supported by microsoft, but D3D has been evangelized as the one true solution for games.

I have been using OpenGL for about six months now, and I have been very impressed by the design of the API, and especially its ease of use. A month ago, I ported quake to OpenGL. It was an extremely pleasant experience. It didn't take long, the code was clean and simple, and it gave me a great testbed to rapidly try out new research ideas.

I started porting glquake to Direct-3D IM with the intent of learning the api and doing a fair comparison.

Well, I have learned enough about it. I'm not going to finish the port. I have better things to do with my time.

I am hoping that the vendors shipping second generation cards in the coming year can be convinced to support OpenGL. If this doesn't happen early on and there are capable cards that glquake does not run on, then I apologize, but I am taking a little stand in my little corner of the world with the hope of having some small influence on things that are going to affect us for many years to come.

Direct-3D IM is a horribly broken API. It inflicts great pain and suffering on the programmers using it, without returning any significant advantages. I don't think there is ANY market segment that D3D is appropriate for, OpenGL seems to work just fine for everything from quake to softimage. There is no good technical reason for the existence of D3D.
I'm sure D3D will suck less with each forthcoming version, but this is an opportunity to just bypass dragging the entire development community through the messy evolution of an ill-birthed API.

Best case: Microsoft integrates OpenGL with direct-x (probably calling it Direct-GL or something), ports D3D retained mode on top of GL, and tells everyone to forget they ever heard of D3D immediate mode. Programmers have one good api, vendors have one driver to write, and the world is a better place.

To elaborate a bit:

"OpenGL" is either OpenGL 1.1 or OpenGL 1.0 with the common extensions. Raw OpenGL 1.0 has several holes in functionality.

"D3D" is Direct-3D Immediate Mode. D3D retained mode is a separate issue. Retained mode has very valid reasons for existence. It is a good thing to have an api that lets you just load in model files and fly around without sweating the polygon details. Retained mode is going to be used by at least ten times as many programmers as immediate mode. On the other hand, the world class applications that really step to new levels are going to be done in an immediate mode graphics API. D3D-RM doesn't even really have to be tied to D3D-IM. It could be implemented to emit OpenGL code instead.

I don't particularly care about the software only implementations of either D3D or OpenGL. I haven't done serious research here, but I think D3D has a real edge, because it was originally designed for software rendering and much optimization effort has been focused there. COSMO GL is attempting to compete there, but I feel the effort is misguided. Software rasterizers will still exist to support the lowest common denominator, but soon all game development will be targeted at hardware rasterization, so that's where effort should be focused.

The primary importance of a 3D API to game developers is as an interface to the wide variety of 3D hardware that is emerging.
If there was one compatible line of hardware that did what we wanted and covered 90+ percent of the target market, I wouldn't even want a 3D API for production use, I would be writing straight to the metal, just like I always have with pure software schemes. I would still want a 3D API for research and tool development, but it wouldn't matter if it wasn't a mainstream solution.

Because I am expecting the 3D accelerator market to be fairly fragmented for the foreseeable future, I need an API to write to, with individual drivers for each brand of hardware. OpenGL has been maturing in the workstation market for many years now, always with a hardware focus. We have existing proof that it scales just great from a $300 permedia card all the way to a $250,000 loaded infinite reality system.

All of the game oriented PC 3D hardware basically came into existence in the last year. Because of the frantic nature of the PC world, we may be getting stuck with a first guess API and driver model which isn't all that good.

The things that matter with an API are: functionality, performance, driver coverage, and ease of use.

Both APIs cover the important functionality. There shouldn't be any real argument about that. GL supports some additional esoteric features that I am unlikely to use (or are unlikely to be supported by hardware -- same effect). D3D actually has a couple nice features that I would like to see moved to GL (specular blend at each vertex, color key transparency, and no clipping hints), which brings up the extensions issue. GL can be extended by the driver, but because D3D imposes a layer between the driver and the API, microsoft is the only one that can extend D3D.

My conclusion about performance is that there is not going to be any significant performance difference (< 10%) between properly written OpenGL and D3D drivers for several years at least.
There are some arguments that gl will scale better to very high end hardware because it doesn't need to build any intermediate structures, but you could use tiny sub cache sized execute buffers in d3d and achieve reasonably similar results (or build complex hardware just to suit D3D -- ack!). There are also arguments from the other side that the vertex pools in d3d will save work on geometry bound applications, but you can do the same thing with vertex arrays in GL.

Currently, there are more drivers available for D3D than OpenGL on the consumer level boards. I hope we can change this. A serious problem is that there are no D3D conformance tests, and the documentation is very poor, so the existing drivers aren't exactly uniform in their functionality. OpenGL has an established set of conformance tests, so there is no argument about exactly how things are supposed to work. OpenGL offers two levels of drivers that can be written: mini client drivers and installable client drivers. A MCD is a simple, robust exporting of hardware rasterization capabilities. An ICD is basically a full replacement for the API that lets hardware accelerate or extend any piece of GL without any overhead.

The overriding reason why GL is so much better than D3D has to do with ease of use. GL is easy to use and fun to experiment with. D3D is not (ahem). You can make sample GL programs with a single page of code. I think D3D has managed to make the worst possible interface choice at every opportunity. COM. Expandable structs passed to functions. Execute buffers. Some of these choices were made so that the API would be able to gracefully expand in the future, but who cares about having an API that can grow if you have forced it to be painful to use now and forever after? Many things that are a single line of GL code require half a page of D3D code to allocate a structure, set a size, fill something in, call a COM routine, then extract the result.

Ease of use is damn important.
If you can program something in half the time, you can ship earlier or explore more approaches. A clean, readable coding interface also makes it easier to find / prevent bugs.

GL's interface is procedural: You perform operations by calling gl functions to pass vertex data and specify primitives.

    glBegin (GL_TRIANGLES);
    glVertex (0,0,0);
    glVertex (1,1,0);
    glVertex (2,0,0);
    glEnd ();

D3D's interface is by execute buffers: You build a structure containing vertex data and commands, and pass the entire thing with a single call. On the surface, this appears to be an efficiency improvement for D3D, because it gets rid of a lot of procedure call overhead. In reality, it is a gigantic pain-in-the-ass.

(pseudo code, and incomplete)

    v = &buffer.vertexes[0];
    v->x = 0; v->y = 0; v->z = 0;
    v++;
    v->x = 1; v->y = 1; v->z = 0;
    v++;
    v->x = 2; v->y = 0; v->z = 0;
    c = &buffer.commands;
    c->operation = DRAW_TRIANGLE;
    c->vertexes[0] = 0;
    c->vertexes[1] = 1;
    c->vertexes[2] = 2;
    IssueExecuteBuffer (buffer);

If I included the complete code to actually lock, build, and issue an execute buffer here, you would think I was choosing some pathologically slanted case to make D3D look bad.

You wouldn't actually make an execute buffer with a single triangle in it, or your performance would be dreadful. The idea is to build up a large batch of commands so that you pass lots of work to D3D with a single procedure call.

A problem with that is that the optimal definition of "large" and "lots" varies depending on what hardware you are using, but instead of leaving that up to the driver, the application programmer has to know what is best for every hardware situation.

You can cover some of the messy work with macros, but that brings its own set of problems. The only way I can see to make D3D generally usable is to create your own procedural interface that buffers commands up into one or more execute buffers and flushes when needed.
But why bother, when there is this other nifty procedural API already there...

With OpenGL, you can get something working with simple, straightforward code, then if it is warranted, you can convert to display lists or vertex arrays for max performance (although the difference usually isn't that large). This is the right way of doing things -- like converting your crucial functions to assembly language after doing all your development in C.

With D3D, you have to do everything the painful way from the beginning. Like writing a complete program in assembly language, taking many times longer, missing chances for algorithmic improvements, etc. And then finding out it doesn't even go faster.

I am going to be programming with a 3D API every day for many years to come. I want something that helps me, rather than gets in my way.

John Carmack
Id Software
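
The workaround Carmack mentions in passing, a procedural interface that buffers commands into execute buffers and flushes when needed, can be sketched in plain C. This is a minimal illustration, not Direct3D code and not anything from Carmack's text: the Batcher type, its functions, and the BATCH_CAPACITY value are all hypothetical, and issue_buffer merely counts what a real implementation would hand to the graphics API.

```c
#include <assert.h>
#include <stddef.h>

/* Triangles per buffer. The value is arbitrary; as the text notes, the best
 * batch size depends on the hardware. */
enum { BATCH_CAPACITY = 4 };

typedef struct { float x, y, z; } Vertex;

typedef struct {
    Vertex vertices[BATCH_CAPACITY * 3];
    size_t vertex_count;       /* vertices waiting in the current buffer */
    size_t flushes;            /* buffers issued so far */
    size_t triangles_issued;   /* triangles handed over so far */
} Batcher;

/* Stand-in for passing a filled buffer to the API in a single call. */
static void issue_buffer(Batcher *b) {
    if (b->vertex_count == 0)
        return;                /* nothing pending, nothing to issue */
    b->triangles_issued += b->vertex_count / 3;
    b->flushes++;
    b->vertex_count = 0;
}

void batcher_init(Batcher *b) {
    assert(b != NULL);
    b->vertex_count = 0;
    b->flushes = 0;
    b->triangles_issued = 0;
}

/* The procedural call the application sees: queue one triangle, issuing the
 * current buffer first if there is no room left for three more vertices. */
void batcher_triangle(Batcher *b, Vertex v0, Vertex v1, Vertex v2) {
    assert(b != NULL);
    if (b->vertex_count + 3 > (size_t)BATCH_CAPACITY * 3)
        issue_buffer(b);
    b->vertices[b->vertex_count++] = v0;
    b->vertices[b->vertex_count++] = v1;
    b->vertices[b->vertex_count++] = v2;
}

/* Call at the end of a frame to push out a partially filled buffer. */
void batcher_flush(Batcher *b) {
    assert(b != NULL);
    issue_buffer(b);
}
```

Queuing ten triangles and then calling batcher_flush issues three buffers (four, four, and two triangles). The application code stays procedural, and the batch size lives in one place where it could be tuned per device, which is roughly the job the text argues should belong to the driver.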
last revised: 30 August 1998