Chapter 6. Frequently Asked Questions

This section provides answers to frequently asked questions associated with the NVIDIA SunOS x86 Driver and its installation. Common problem diagnoses can be found in Chapter 7, Common Problems, and tips for new users can be found in Appendix G, Tips for New SunOS Users. Detailed information for specific setups is provided in the Appendices.

6.1. NVIDIA Driver

Where should I start when diagnosing display problems?

One of the most useful tools for diagnosing problems is the X log file in /var/log. Lines that begin with (II) are information, (WW) are warnings, and (EE) are errors. You should make sure that the correct config file (i.e. the config file you are editing) is being used; look for the line that begins with:

    (==) Using config file:

Also make sure that the NVIDIA driver is being used, rather than another driver. Search for:

    (II) LoadModule: "nvidia"

Lines from the driver should begin with:

    (II) NVIDIA(0)
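The checks described above can be sketched as a small Python helper. This is only an illustration, not part of the driver: the function name summarize_x_log is hypothetical, and the optional timestamp prefix it strips is an assumption about how some X servers format their log lines.

```python
import re

def summarize_x_log(text):
    """Collect the config file in use, whether the "nvidia" module was
    loaded, and every warning (WW) and error (EE) line from X log text."""
    summary = {"config_file": None, "nvidia_loaded": False,
               "warnings": [], "errors": []}
    for line in text.splitlines():
        # Assumption: some servers prefix lines with "[  12.345] "; drop it.
        line = re.sub(r"^\[\s*[\d.]+\] ", "", line).rstrip()
        if line.startswith("(==) Using config file:"):
            summary["config_file"] = line.split(":", 1)[1].strip().strip('"')
        elif line.startswith('(II) LoadModule: "nvidia"'):
            summary["nvidia_loaded"] = True
        elif line.startswith("(WW)"):
            summary["warnings"].append(line)
        elif line.startswith("(EE)"):
            summary["errors"].append(line)
    return summary
```

Only lines that begin with a marker are counted, so the indented marker legend near the top of the log is not mistaken for a warning. To use the helper, read the X log file from /var/log and pass its contents as text.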

How can I increase the amount of data printed in the X log file?

By default, the NVIDIA X driver prints relatively few messages to stderr and the X log file. If you need to troubleshoot, then it may be helpful to enable more verbose output by using the X command line options -verbose and -logverbose, which can be used to set the verbosity level for the stderr and log file messages, respectively. The NVIDIA X driver will output more messages when the verbosity level is at or above 5 (X defaults to verbosity level 1 for stderr and level 3 for the log file). So, to enable verbose messaging from the NVIDIA X driver to the X server log file, you could start X with the verbosity level set to 5, by doing the following:

  • Become root.

  • If not done yet, copy your configuration file for the X server from /usr/dt/config to /etc/dt/config:

    # cp /usr/dt/config/Xservers /etc/dt/config/Xservers
    

  • At the end of the /etc/dt/config/Xservers file, add the option '-logverbose 5' to the uncommented line(s) starting with the display number. For example:

      :0   Local local_uid@console root /usr/X11/bin/Xserver :0 -nobanner -logverbose 5
    

  • Restart the X server.
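If the edit to the Xservers file needs to be scripted, it might look like the following sketch. The helper name add_logverbose is hypothetical, and the code assumes each server entry occupies a single line that starts with the display number (for example ":0"); it works on the file contents as a string and leaves reading and writing /etc/dt/config/Xservers to the caller.

```python
def add_logverbose(xservers_text, level=5):
    """Append "-logverbose <level>" to every uncommented server line."""
    out = []
    for line in xservers_text.splitlines():
        stripped = line.strip()
        # Uncommented server lines start with the display number, e.g. ":0".
        # Lines that already carry the option are left untouched.
        if stripped.startswith(":") and "-logverbose" not in stripped:
            line = f"{line.rstrip()} -logverbose {level}"
        out.append(line)
    return "\n".join(out) + "\n"
```

Running the function a second time on its own output changes nothing, so it is safe to apply repeatedly.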

How do I uninstall the NVIDIA Solaris Graphics driver?

The NVIDIA Solaris Graphics driver files are delivered in two Solaris packages, NVDAgraphicsr and NVDAgraphics, and both need to be uninstalled. Remove the package NVDAgraphicsr first, then the package NVDAgraphics:

    # pkgrm NVDAgraphicsr NVDAgraphics

Why does X use so much memory?

When measuring any application's memory usage, you must be careful to distinguish between physical system RAM used and virtual mappings of shared resources. For example, most shared libraries exist only once in physical memory but are mapped into multiple processes. This memory should only be counted once when computing total memory usage. In the same way, the video memory on a graphics card or register memory on any device can be mapped into multiple processes. These mappings do not consume normal system RAM.

This has been a frequently discussed topic on XFree86 mailing lists; see, for example:

http://marc.theaimsgroup.com/?l=xfree-xpert&m=96835767116567&w=2

The pmap utility is available in the directory /usr/proc/bin, and is a useful tool in distinguishing between types of memory mappings. For example, while prstat may indicate that X is using several hundred MB of memory, the last line of output from pmap -x:

        total Kb  337904  335884   53320       -

reveals that X is really only using roughly 53 MB of system RAM (the "anon" value).
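As a rough illustration, the total line can be split into named fields with a few lines of Python. The function name parse_pmap_total is hypothetical, and the column order (Kbytes, RSS, Anon, Locked) is an assumption based on the text above, which identifies the third figure as the "anon" value.

```python
def parse_pmap_total(line):
    """Split the "total Kb" line of pmap -x output into named values (in KB)."""
    fields = line.split()
    if fields[:2] != ["total", "Kb"]:
        raise ValueError("not a pmap -x total line")
    # Assumed column order; a "-" means no value was reported for the column.
    names = ("kbytes", "rss", "anon", "locked")
    return {name: (None if value == "-" else int(value))
            for name, value in zip(names, fields[2:])}
```

Applied to the line shown above, the "anon" entry is 53320 KB, which is the roughly 53 MB figure quoted in the text.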

Note, also, that X must allocate resources on behalf of X clients (the window manager, your web browser, etc.); the X server's memory usage will increase as more clients request resources such as pixmaps, and decrease as you close X applications.

Why do applications that use DGA graphics fail?

The NVIDIA driver does not support the graphics component of the XFree86-DGA (Direct Graphics Access) extension. Applications can use the XDGASelectInput() function to acquire relative pointer motion, but graphics-related functions such as XDGASetMode() and XDGAOpenFramebuffer() will fail.

The graphics component of XFree86-DGA is not supported because it requires a CPU mapping of framebuffer memory. As graphics cards ship with increasing quantities of video memory, the NVIDIA X driver has had to switch to a more dynamic memory mapping scheme that is incompatible with DGA. Furthermore, DGA does not cooperate with other graphics rendering libraries such as Xlib and OpenGL because it accesses GPU resources directly.

NVIDIA recommends that applications use OpenGL or Xlib, rather than DGA, for graphics rendering. Using rendering libraries other than DGA will yield better performance and improve interoperability with other X applications.

My kernel log contains messages that are prefixed with "Xid"; what do these messages mean?

"Xid" messages indicate that a general GPU error occurred, most often due to the driver misprogramming the GPU or to corruption of the commands sent to the GPU. These messages provide diagnostic information that can be used by NVIDIA to aid in debugging reported problems.

Some information on how to interpret Xid messages is available here: http://docs.nvidia.com/deploy/xid-errors/index.html

My kernel log contains the message "NVRM: Xid (...): 81, VGA Subsystem Error." How can I fix this?

In some extreme cases, the VGA console can hang if messages are printed to a legacy VGA text console concurrently with applications that generate high GPU memory traffic.

The solution to this problem is to not use a legacy VGA text console. Instead, on capable systems, use pure UEFI mode (not Compatibility Support Module (CSM)). On legacy SBIOS systems, use a framebuffer console such as vesafb.

I use the Coolbits overclocking interface to adjust my graphics card's clock frequencies, but the defaults are reset whenever X is restarted. How do I make my changes persistent?

By default, clock frequency settings are not saved and restored automatically. This avoids potential stability and other problems that may be encountered if the chosen frequency settings differ from the defaults qualified by the manufacturer. You can add an nvidia-settings command to ~/.xinitrc to automatically apply custom clock frequency settings when the X server is started. See the nvidia-settings(1) manual page for more information on specifying clock frequency settings on the command line.

Why is the refresh rate not reported correctly by utilities that use the XF86VidMode X extension and/or RandR X extension versions prior to 1.2 (e.g., `xrandr --q1`)?

These extensions are not aware of multiple display devices on a single X screen; they only see the MetaMode bounding box, which may contain one or more actual modes. This means that if multiple MetaModes have the same bounding box, these extensions will not be able to distinguish between them. In order to support dynamic display configuration, the NVIDIA X driver must make each MetaMode appear to be unique and accomplishes this by using the refresh rate as a unique identifier.

You can use `nvidia-settings -q RefreshRate` to query the actual refresh rate on each display device.

Why does starting certain applications result in Xlib error messages indicating extensions like "XFree86-VidModeExtension" or "SHAPE" are missing?

If your X config file has a Module section that does not list the "extmod" module, some X server extensions may be missing, resulting in error messages of the form:

Xlib: extension "SHAPE" missing on display ":0.0"
Xlib: extension "XFree86-VidModeExtension" missing on display ":0.0"
Xlib: extension "XFree86-DGA" missing on display ":0.0"

You can solve this problem by adding the line below to your X config file's Module section:

    Load "extmod"
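A quick way to check a configuration file for this condition is sketched below. The function name module_section_lacks_extmod is hypothetical, and the parsing is deliberately simplified: it assumes one keyword per line and treats everything after "#" as a comment.

```python
import re

def module_section_lacks_extmod(conf_text):
    """Return True if a Module section exists but does not load "extmod"."""
    in_module = False
    saw_module = False
    loaded = set()
    for raw in conf_text.splitlines():
        line = raw.split("#", 1)[0].strip()   # drop comments and padding
        if re.match(r'(?i)section\s+"module"', line):
            in_module = saw_module = True
        elif re.match(r"(?i)endsection\b", line):
            in_module = False
        elif in_module:
            m = re.match(r'(?i)load\s+"([^"]+)"', line)
            if m:
                loaded.add(m.group(1).lower())
    return saw_module and "extmod" not in loaded
```

The function returns False when the file has no Module section at all, since the problem described above arises only when a Module section is present and omits "extmod".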

Where can I find older driver versions?

Please visit https://download.nvidia.com/solaris/

What is the format of a PCI Bus ID?

Different tools have different formats for the PCI Bus ID of a PCI device.

The X server's "BusID" X configuration file option interprets the BusID string in the format "bus@domain:device:function" (the "@domain" portion is only needed if the PCI domain is non-zero), in decimal. More specifically,

"%d@%d:%d:%d", bus, domain, device, function

in printf(3) syntax. NVIDIA X driver logging, nvidia-xconfig, and nvidia-settings match the X configuration file BusID convention.

The scanpci(1) utility, in contrast, prints the PCI Bus ID in a space-separated format in hexadecimal. More specifically,

"pci bus 0x%04x cardnum 0x%02x function 0x%02x", bus, device, function

in printf(3) syntax.
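To make the difference between the two formats concrete, the following sketch converts a line in the scanpci format into a string in the X configuration file format. The function name scanpci_to_busid is hypothetical; because the scanpci format shown above carries no domain field, the domain is taken as a separate argument, which is an assumption of this example.

```python
import re

# Matches the hexadecimal scanpci format quoted above.
SCANPCI_RE = re.compile(
    r"pci bus 0x([0-9a-fA-F]+) cardnum 0x([0-9a-fA-F]+) function 0x([0-9a-fA-F]+)")

def scanpci_to_busid(line, domain=0):
    """Convert scanpci's hexadecimal fields to a decimal BusID string."""
    m = SCANPCI_RE.search(line)
    if m is None:
        raise ValueError("line is not in scanpci format")
    bus, device, function = (int(group, 16) for group in m.groups())
    # The "@domain" portion is only needed when the domain is non-zero.
    bus_part = f"{bus}@{domain}" if domain else str(bus)
    return f"{bus_part}:{device}:{function}"
```

For instance, "pci bus 0x000a cardnum 0x00 function 0x00" becomes "10:0:0", since hexadecimal 0x0a is decimal 10.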

How do I interpret X server version numbers?

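X server version numbers can be difficult to interpret because different X servers report the versions of different things: some report the version of the overall release they shipped in, while others report the version of the X server module itself.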

In 2003, X.Org created a fork of the XFree86 project's code base, which used a monolithic build system to build the X server, libraries, and applications together in one source code repository. X.Org resumed its release version numbering where it had left off in 2001, continuing with 6.7, 6.8, etc., for the releases of this large bundle of code. These version numbers are sometimes written X11R6.7, X11R6.8, etc. to include the version of the X protocol.

In 2005, an effort was made to split the monolithic code base into separate modules with their own version numbers to make them easier to maintain and so that they could be released independently. X.Org still occasionally releases these modules together, with a single version number. These releases are simply referred to as “X.Org releases”, or sometimes “katamari” releases. For example, X.Org 7.6 was released on December 20, 2010 and contains version 1.9.3 of the xorg-server package, which contains the core X server itself.

The release management changes from XFree86, to X.Org monolithic releases, to X.Org modular releases impacted the behavior of the X server's -version command line option. For example, XFree86 X servers always report the version of the XFree86 monolithic package:

XFree86 Version 4.3.0 (Red Hat Linux release: 4.3.0-2)
Release Date: 27 February 2003
X Protocol Version 11, Revision 0, Release 6.6

X servers in X.Org monolithic and early “katamari” releases did something similar:

X Window System Version 7.1.1
Release Date: 12 May 2006
X Protocol Version 11, Revision 0, Release 7.1.1

However, X.Org later modified the X server to start printing its individual module version number instead:

X.Org X Server 1.9.3
Release Date: 2010-12-13
X Protocol Version 11, Revision 0

Please keep this in mind when comparing X server versions: what looks like “version 7.x” is older than version 1.x.
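The three banner styles shown above can be told apart by the first line of the -version output. The sketch below is only an illustration: the function name classify_x_version and the descriptive labels it returns are hypothetical, and it recognizes only the three banner formats quoted in this section.

```python
import re

def classify_x_version(first_line):
    """Return (what the number refers to, version string) for a banner line."""
    m = re.match(r"XFree86 Version (\S+)", first_line)
    if m:
        return ("XFree86 monolithic release", m.group(1))
    m = re.match(r"X Window System Version (\S+)", first_line)
    if m:
        return ("X.Org monolithic or early katamari release", m.group(1))
    m = re.match(r"X\.Org X Server (\S+)", first_line)
    if m:
        return ("xorg-server module", m.group(1))
    return ("unknown", None)
```

Comparing two version numbers is only meaningful when both banners fall into the same category.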

Why doesn't the NVIDIA X driver make more display resolutions and refresh rates available via RandR?

Prior to the 302.* driver series, the list of modes reported to applications by the NVIDIA X driver was not limited to the list of modes natively supported by a display device. In order to expose the largest possible set of modes on digital flat panel displays, which typically do not accept arbitrary mode timings, the driver maintained separate sets of "front-end" and "back-end" mode timings, and scaled between them to simulate the availability of more modes than would otherwise be supported.

Front-end timings were the values reported to applications, and back-end timings were what was actually sent to the display. Both sets of timings went through the full mode validation process, with the back-end timings having the additional constraint that they must be provided by the display's EDID, as only EDID-provided modes can be safely assumed to be supported by the display hardware. Applications could request any available front-end timings, which the driver would implicitly scale to either the "best fit" or "native" mode timings. For example, an application might request an 800x600 @ 60 Hz mode and the driver would provide it, but the real mode sent to the display would be 1920x1080 @ 30 Hz.

While the availability of modes beyond those natively supported by a display was convenient for some uses, it created several problems. For example:

  • The complete front-end timings were reported to applications, but only the width and height were actually used. This could cause confusion because in many cases, changing the front-end timings did not change the back-end timings. This was especially confusing when trying to change the refresh rate, because the refresh rate in the front-end timings was ignored, but was still reported to applications.

  • The front-end timings reported to the user could be different from the back-end timings reported in the display device's on-screen display, leading to user confusion. Finding out the back-end timings (e.g. to find the real refresh rate) required using the NVIDIA-specific NV-CONTROL X extension.

  • The process by which back-end timings were selected for use with any given front-end timings was not transparent to users, and this process could only be explicitly configured with NVIDIA-specific xorg.conf options or the NV-CONTROL X extension. Confusion over how changing front-end timings could affect the back-end timings was especially problematic in use cases that were sensitive to the timings the display device receives, such as NVIDIA 3D Vision.

  • User-specified modes underwent normal mode validation, even though the timings in those modes were not used. For example, a 1920x1080 @ 100 Hz mode might fail the VertRefresh check, even though the back-end timings might actually be 1920x1080 @ 30 Hz.

Version 1.2 of the X Resize and Rotate extension (henceforth referred to as "RandR 1.2") allows configuration of display scaling in a much more flexible and standardized way. The protocol allows applications to choose exactly which (back-end) mode timing is used, and exactly how the screen is scaled to fill that mode. It also allows explicit control over which displays are enabled, and which portions of the screen they display. This also provides much-needed transparency: the mode timings reported by RandR 1.2 are the actual mode timings being sent to the display. However, this means that only modes actually supported by the display are reported in the RandR 1.2 mode list. Scaling configurations, such as the 800x600 to 1920x1080 example above, need to be configured via the RandR 1.2 transform feature. Adding implicitly scaled modes to the mode list would conflict with the transform configuration options and reintroduce the same problems that the previous front-end/back-end timing system had.

With the introduction of RandR 1.2 support to the 302.* driver series, the front-end/back-end timing system was abandoned, and the list of mode timings exposed by the NVIDIA X driver was simplified to include only those modes which would actually be driven by the hardware. Although it remained possible to manually configure all of the scaling configurations that were previously possible, and many scaling configurations which were previously impossible, this change resulted in some inconvenient losses of functionality:

  • Applications which used RandR 1.1 or earlier or XF86VidMode to set modes no longer had the implicitly scaled front-end timings available to them. Many displays have EDIDs which advertise only the display's native resolution, or a list of resolutions that is otherwise small, compared to the list that would previously have been exposed as front-end timings, preventing these applications from setting modes that were possible with previous versions of the NVIDIA driver.

  • The nvidia-settings control panel, which formerly listed all available front-end modes for displays in its X Server Display Configuration page, only listed the actual back-end modes.

Subsequent driver releases restored some of this functionality without reverting to the front-end/back-end system:

  • The NVIDIA X driver now builds a list of "Implicit MetaModes", which implicitly scale many common resolutions to a mode that is supported by the display. These modes are exposed to applications which use RandR 1.1 and XF86VidMode, as neither supports the scaling or other transform capabilities of RandR 1.2.

  • The resolution list in the nvidia-settings X Server Display Configuration page now includes explicitly scaled modes for many common resolutions which are not directly supported by the display. To reduce confusion, the scaled modes are identified as being scaled, and it is not possible to set a refresh rate for any of the scaled modes.

As mentioned previously, the RandR 1.2 mode list contains only modes which are supported by the display. Modern applications that wish to set modes other than those available in the RandR 1.2 mode list are encouraged to use RandR 1.2 transformations to program any required scaling operations. For example, the xrandr utility can program RandR scaling transformations, and the following command can scale a 1280x720 mode to a display connected to output DVI-I-0 that does not support the desired mode, but does support 1920x1080:

xrandr --output DVI-I-0 --mode 1920x1080 --scale-from 1280x720
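An application that launches xrandr itself could assemble the same command programmatically, as in the sketch below. The function name scale_from_command is hypothetical. The returned scale factors (desired size divided by the size of the mode) are included only to show how much scaling is being requested; treating them as equivalent to what --scale-from programs is an assumption of this example.

```python
def scale_from_command(output, mode, desired):
    """Build an xrandr argument list that scales `desired` onto `mode`.

    `mode` and `desired` are "WIDTHxHEIGHT" strings, e.g. "1920x1080".
    """
    mode_w, mode_h = (int(v) for v in mode.split("x"))
    want_w, want_h = (int(v) for v in desired.split("x"))
    # Fraction of the mode's width and height covered by the desired size.
    factors = (want_w / mode_w, want_h / mode_h)
    argv = ["xrandr", "--output", output,
            "--mode", mode, "--scale-from", desired]
    return argv, factors
```

The argument list can be passed to subprocess.run() on a system where xrandr is installed and the output name matches an output reported by the X server.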