Posts tagged with "Xorg"

nVidia (driver) strikes back yet another time

As usual, I'm here again to blog about... guess what? Problems with the nvidia driver.

Right now I have an Asus M51Sn notebook, which has an nVidia GeForce 9500M GS card. I'm using 64-bit Gentoo Linux with vanilla kernel 2.6.25.5. At the time I installed it, the 2.6.25.x kernels were not marked as stable in Gentoo, but I chose to install one anyway. However, the then-stable x11-drivers/nvidia-drivers-169.09-r1 did not compile against those kernel versions (bug 218178).

Solution? Well, I just installed the next version (x11-drivers/nvidia-drivers-169.12) and it worked pretty well.

In fact, it worked very well until this week, when x11-drivers/nvidia-drivers-173.14.09 was marked stable (while 169.12 was still ~testing). After this update, my 3D desktop simply stopped working. Well, it starts, and I can launch an app or two, but then everything is frozen, or almost frozen, in such a weird way that the desktop is simply not usable.

Solution? I'm going back to version 169.12.

This comment on bug 218178 was a bit scary for me:
Which is why you use the latest stable version, 173.14.09 which is designed for 2.6.25.

What does that mean? Does it mean that the whole 169.x series was not designed for 2.6.25? Or just that 169.09 was not, while 169.12 is OK (his comment leaves that undefined)?

Well, anyway, I'm masking 173.14.09. I'm going to stick with 169.12; at least this version works for me... until the next nvidia strike.

nVidia (driver) strikes back (again)

This is not the first time I've posted here about a problem with the nVidia binary driver on my system. Nor is it the second time. Nor the third... Well, I could even rename my blog to "Crazy nVidia bugs".

Recently, nvidia-drivers-100.14.09 has been marked stable in Gentoo. I looked at that and thought to myself: "Wtf? What is this weird version number? Hum, maybe a whole new development branch? Maybe a code rewrite? I don't like the way it looks, I predict it might be buggy."

Unfortunately, it is buggy.

The new bug is: when exiting X and returning to the text-mode console, sometimes the screen goes black. When this happens, I can't see what's written on the console, but I can still use it (I could hear beeps when pressing some keys, and I managed to run startx again). I tried to ssh into the machine and found nothing strange: no process was eating lots of CPU (unlike that X.Org-freezing bug described in many previous posts here).

I've added a bug report at Gentoo Bugzilla. It is bug #186596.

I've also sent some e-mails to the nVidia Linux bug report address. I got a very quick response telling me that version 100.14.09 is no longer supported, and asking me to test 100.14.11. I did, and the bug is present in both versions.

If you wanna keep track of this issue, watch bug #186596. But, for now, I'm masking versions 100.14.09 and 100.14.11 and I'm going to use version 1.0.9639.

NVIDIA = No VIDeo for vIA

If you read this blog, you might remember how many times I tried to make the nvidia driver work with X.org without freezing it. The solution was to disable AGP support and wait until X.org 7 was released, since I read somewhere that nVidia had sent some patches to it.

Seven months after my previous try, I now have modular X.org 7.0, and Gentoo "told me" to update my nvidia drivers. It is a good time to try again.

Versions: Modular X.org 7.0.0, nvidia-drivers-1.0.8762-r1, vanilla-kernel-2.6.17.6
Hardware: Pentium III 800MHz, Asus CUV4X motherboard (with Via chipset), GeForce FX 5500 videocard

I edited my xorg.conf file and commented out the following line:
#Option      "NvAGP"        "0"
This way, the AGP setting was "automatic". After I started X, the via_agp module was loaded, and /proc/driver/nvidia/agp/status told me that AGP was enabled at 4X (my video card supports 8X, but my mobo only supports 4X).

For some time, everything was fine. I noticed no speed-up by having AGP enabled. If there was some speed-up, it was minimal.

Then, suddenly, X froze. Exactly the same symptoms as before: the keyboard does not work, nothing works, but the mouse cursor still moves on screen. Although I couldn't check it myself, people say the system is still up and running and that, when opening an ssh session, you can see the X process taking almost 100% of CPU.

Well, no ssh for me, so no way to check or kill the X process. So I pressed Alt+SysRq+K (see footnote), which killed X and returned me to a plain text console (I don't use framebuffer, bootsplash or similar). Fortunately, the console was a pure and working text console. I remember other times when I was forced to do that and the monitor stayed in graphics mode, displaying completely garbled pixels, even though the console was "working" (I could type commands, but could not see what was printed).

Looking at dmesg output, I can see some very familiar lines:
agpgart: Found an AGP 2.0 compliant device at 0000:00:00.0.
agpgart: Putting AGP V2 device at 0000:00:00.0 into 4x mode
agpgart: Putting AGP V2 device at 0000:01:00.0 into 4x mode
NVRM: Xid (0001:00): 6, PE0000 1ffc 00000000 0000f74c 0000ffff 00000000
SysRq : SAK
That NVRM: Xid... line was present every time X froze (maybe with other values). Looking through this blog's archive, you might find that line in other posts.
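Since that line is the one thing all the freezes have in common, it can be handy to pull it out of a saved log automatically. Here is a minimal Python sketch of that idea. The function name find_xid_events and the regular expression are my own illustration, written only against the line format shown above; other driver versions may print these lines differently.

```python
import re

# Matches lines shaped like the one in the dmesg output above, e.g.
# "NVRM: Xid (0001:00): 6, PE0000 1ffc ..." (assumed format, based on that one sample).
XID_RE = re.compile(r"NVRM: Xid \(([^)]*)\): (\d+), (.*)")


def find_xid_events(log_text):
    """Return a list of (device, xid_number, details) for each NVRM Xid line."""
    events = []
    for line in log_text.splitlines():
        match = XID_RE.search(line)
        if match:
            device, number, details = match.groups()
            events.append((device, int(number), details.strip()))
    return events


# The dmesg excerpt from this post, used as sample input.
SAMPLE = """\
agpgart: Found an AGP 2.0 compliant device at 0000:00:00.0.
agpgart: Putting AGP V2 device at 0000:00:00.0 into 4x mode
agpgart: Putting AGP V2 device at 0000:01:00.0 into 4x mode
NVRM: Xid (0001:00): 6, PE0000 1ffc 00000000 0000f74c 0000ffff 00000000
SysRq : SAK
"""

if __name__ == "__main__":
    for device, number, details in find_xid_events(SAMPLE):
        print(f"device {device}: Xid {number} ({details})")
```

To use it on a real machine, save the output of dmesg to a file and pass the file contents to find_xid_events instead of SAMPLE.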

Conclusion: X.org 7 did not fix this issue (as I thought it would). In fact, I don't even know where this issue comes from: the agpgart module, the via_agp module, the nvidia module, X.org or even the Via hardware.

Solution: Put the Option "NvAGP" "0" line in your xorg.conf, near the Driver "nvidia" line. If you then run cat /proc/driver/nvidia/agp/status, it will print Status: Disabled, because AGP support will be disabled. This causes no noticeable slowdown, and everything else will still work fine, including 3D OpenGL programs and games. And, at least, X won't be crashing and you will have a stable system again.

Oh, one last note: I've tested whether that (wrong) behavior described in nVidia versus fonts! has changed, and found that it is still the same. So, all the information in that old post is still valid.

Footnote about SAK/SysRq: The Alt+SysRq+K combination (SysRq is the same key as Print Screen) is trapped by the kernel and means Secure Access Key (SAK). It will kill all programs on the current virtual console. To enable it, you might need to recompile your kernel with Magic SysRq key support, or modify your keyboard mapping. Read more at /usr/src/linux/Documentation/sysrq.txt and /usr/src/linux/Documentation/SAK.txt.

edit: A friend told me he had the same problem. He has an AMD Sempron 2200+ on an Abit VA-10 motherboard (Via chipset) and a GeForce 4 MX 440 64 MB.

nVidia versus fonts!

Well, looks like disabling AGP resulted in a stable system. No crashes so far (since yesterday).

However, as soon as I started X with the nvidia driver, I noticed something wrong... The fonts in most applications were smaller! Both Psi (a Qt application) and X-Chat (a GTK application) displayed fonts that were way too small. So, I started hunting for the cause...

When using the nv driver, it correctly autodetects my screen size (280x210mm) and sets the DPI to 92x92. However, the nvidia driver sets the DPI to 72x72 and the screen size to 361x271mm. You can see these values by running xdpyinfo | grep -B1 dot (as described in the nVidia driver README). You can also see the screen size (but not the DPI) using the xrandr command.

Well, the first thought was to set the screen size by hand, in /etc/X11/xorg.conf. So I added DisplaySize 280 210 to the "Monitor" section. Then I restarted X and... no change. xdpyinfo still showed the same DPI and screen size.

Conclusion 1: nvidia driver ignores DisplaySize setting.

When I asked at #nvidia, people told me to set the DPI. So I tried it.

There are a couple of ways of doing it. One of them is using the -dpi command-line parameter. Note that:
startx -dpi 75x75
does not work. You must use:
startx -- -dpi 75x75
There is a reason for that (read the startx manpage to learn more). Only the parameters after -- are passed to the X server; the ones before it go to the client side, and are also used to choose which client to start. The manpage has three examples:
startx -- -depth 16
startx -- -dpi 100
startx -- -layout Multihead

Well, back on-topic, I tried to run startx using the DPI parameter. It worked, the fonts are now at their normal size. In addition, it also changed the screen dimensions.

Conclusion 2: Setting the correct DPI solves the font-size problems. (as Chapter 5 - Common Problems explains)

To avoid passing the -dpi parameter all the time, I added the following lines to the "Device" section corresponding to my videocard with the nvidia driver:
Option "UseEdidDpi"   "false"
Option "Dpi"          "92 x 92"
Without "UseEdidDpi" set to "false", the nvidia driver ignores the "Dpi" option and sets the DPI based on some EDID info, or based on telepathic powers.

After those tries (a few more than listed here), and after reading the Appendix Y - Dots Per Inch, I finally understood how it works:

Conclusion 3: nvidia calculates the screen dimensions based on DPI and resolution. (1024/92dpi = 283mm ; 768/92dpi = 212mm)
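To double-check that arithmetic, here is a minimal Python sketch. The helper name size_mm is made up for illustration (it is not part of the driver or of any X tool), and rounding to the nearest millimeter is my assumption about how the numbers are presented.

```python
MM_PER_INCH = 25.4


def size_mm(pixels, dpi):
    """Physical size in millimeters implied by a pixel count and a DPI value.

    Rounding to the nearest millimeter is an assumption made for this sketch.
    """
    return round(pixels / dpi * MM_PER_INCH)


if __name__ == "__main__":
    # 1024x768 at the 92 DPI set by hand
    print(size_mm(1024, 92), size_mm(768, 92))
    # 1024x768 at the 72 DPI that the nvidia driver picked by default
    print(size_mm(1024, 72), size_mm(768, 72))
```

With 92 DPI this gives the 283x212mm from Conclusion 3, and with 72 DPI it gives the 361x271mm that xdpyinfo reported when the fonts were too small, so the reported "screen size" really does follow from DPI and resolution.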

So, if you want to display fonts at the correct size on your screen, DON'T waste time setting "DisplaySize". Instead, measure your monitor's width (only the viewable area), convert that to inches, and divide your horizontal resolution by it. Set this value as the "Dpi" option, and don't forget to set "UseEdidDpi" to "false". Thanks nVidia for this.

This is completely wrong, in my opinion. Following the way nvidia behaves, the monitor dimensions are the variable to be calculated. In other words, according to the nvidia driver, your monitor size is not constant. This is stupid. If I change the resolution, the DPI stays constant, so it is my monitor screen that becomes smaller. It is as if my monitor must shrink to fit nvidia's calculations.
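That recipe can be sketched in a few lines of Python. dpi_from_measurement is a hypothetical helper name used only for this example; the 280x210mm figures are the viewable size that the nv driver detected for my monitor.

```python
MM_PER_INCH = 25.4


def dpi_from_measurement(pixels, measured_mm):
    """DPI along one axis: the pixel count divided by the measured length in inches."""
    inches = measured_mm / MM_PER_INCH
    return pixels / inches


if __name__ == "__main__":
    horizontal = dpi_from_measurement(1024, 280)  # width of the viewable area
    vertical = dpi_from_measurement(768, 210)     # height of the viewable area
    print(f"{horizontal:.1f} x {vertical:.1f}")
```

For my monitor both axes come out just under 93, which is in line with the 92x92 that the nv driver reported.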

Read the following two excerpts from Appendix Y:
If the display device provides an EDID, and the EDID contains information about the physical size of the display device, that is used to compute the DPI. [...]

Note that the physical size of the X screen, as reported through `xdpyinfo` is computed based on the DPI and the size of the X screen in pixels.

As far as I can understand, the nvidia driver gets the physical size from the monitor, uses it to calculate the DPI, then uses the DPI and the screen resolution to calculate the physical size. I can't understand why nVidia chose to do it this way. It makes no sense to me.

It is stupid, but true.