Extended Error Code Ecc Chipkill
Doug Thompson, 2007-01-19 19:34:52 UTC (quoting Orion Poplawski and Robert Hancock):
Can someone please explain to me what these mean?
EDAC k8 MC1: general bus error: participating
answered Jun 1 '09 at 20:51, Josh
Ah, that's awesome!
By way of example, I had to identify a bad DIMM in a Linux server with 16 fully populated DIMM slots and two CPUs.
Thanks in advance. narayanapalla
Guessing that MC1 is the controller on the second CPU.
http://serverfault.com/questions/5672/ecc-chipkill-errors-which-dimm
Not much on the internet :( –markdrayton Jun 9 '09 at 9:04
I've not run into that issue either.
No red LEDs on memory DIMMs.
- But as I said before, we need to have better mapping, but I'll have to have some free time first to be able to do it :) Regards/Gruss, Boris.
- So either replace it all with brand new memory, or wait until it totally fails.
- And is it possible to reconfigure these bad memory modules? Please let me know, because it will help me a lot.
Hi, yes, correct: this may be a memory module problem. unSpawn
Can any kernel or hardware gurus out there let me know if the error messages above allow me to locate the potentially bad memory stick? Is there a cunning way to work out which DIMM's bust while the server is up?
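One possible way to approach this while the server is up, offered as a minimal sketch rather than a definitive answer: the Linux EDAC drivers publish per-row error counters in sysfs. The code below assumes the legacy layout `/sys/devices/system/edac/mc/mc*/csrow*/` with `ce_count` and `ue_count` files; newer kernels may expose a different layout, and the function names are illustrative, not taken from any of the threads.

```python
from pathlib import Path


def read_edac_counts(root="/sys/devices/system/edac/mc"):
    """Return {(mc_name, csrow_name): (ce_count, ue_count)}.

    Assumes the legacy EDAC sysfs layout mc*/csrow*/{ce_count,ue_count}.
    Rows whose counter files are missing or unreadable are skipped.
    """
    counts = {}
    for csrow_dir in sorted(Path(root).glob("mc*/csrow*")):
        try:
            ce = int((csrow_dir / "ce_count").read_text().strip())
            ue = int((csrow_dir / "ue_count").read_text().strip())
        except (OSError, ValueError):
            continue
        counts[(csrow_dir.parent.name, csrow_dir.name)] = (ce, ue)
    return counts


def rows_with_errors(counts):
    """Keep only rows that logged at least one corrected or uncorrected error."""
    return {key: val for key, val in counts.items() if val[0] > 0 or val[1] > 0}


if __name__ == "__main__":
    for (mc, csrow), (ce, ue) in sorted(rows_with_errors(read_edac_counts()).items()):
        print(f"{mc}/{csrow}: corrected={ce} uncorrected={ue}")
```

A row with a steadily climbing corrected count is the likeliest candidate; mapping that row to a physical slot still depends on the board, as discussed elsewhere in these threads.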
Monitoring
If you're interested in monitoring these failures and setting thresholds, you might want to take a look at the mcelog package.
answered Sep 21 '13 at 0:55 (edited Sep 21 '13 at 1:03), slm
Well, you seem to have a k8 system, which means a single DRAM controller with two channels.
You may be able to figure out from this info what DIMM is having the problem.
That was my assumption as well, but was hoping someone could decode the above information and point me
http://www.gossamer-threads.com/lists/linux/kernel/1207815
So, our task is to investigate why these kinds of error messages were being generated continuously.
Example: hpasmcli -s "show dimm"

DIMM Configuration
------------------
Cartridge #: 0
Module #: 1
Present: Yes
Form Factor: 9h
Memory Type: 13h
Size: 1024 MB
Speed: 667 MHz
Status: Ok
Cartridge

How to resolve these bad memory modules?
Run memtest on your machine.
A little quicker than analyzing EDAC.
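To scan `hpasmcli -s "show dimm"` output across many slots, something like the following sketch might help. It assumes the real output has one `Key: value` pair per line and that each DIMM record begins with a `Cartridge #` line, as the sample above suggests; the function names are hypothetical, and which `Status` values other than `Ok` the tool can report is not covered by the source.

```python
def parse_hpasm_dimms(text):
    """Split 'show dimm' output into one dict per DIMM record.

    Assumption: each record starts at a 'Cartridge #' line and every
    field is a single 'Key: value' line. Lines without a colon
    (title, dashes) are ignored.
    """
    dimms, current = [], None
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if not sep:
            continue
        key, value = key.strip(), value.strip()
        if key == "Cartridge #":
            current = {}
            dimms.append(current)
        if current is not None:
            current[key] = value
    return dimms


def suspect_dimms(dimms):
    """Return present DIMMs whose Status field is anything other than 'Ok'."""
    return [d for d in dimms if d.get("Present") == "Yes" and d.get("Status") != "Ok"]


# Sample record copied from the output quoted above.
SAMPLE = """\
DIMM Configuration
------------------
Cartridge #: 0
Module #: 1
Present: Yes
Form Factor: 9h
Memory Type: 13h
Size: 1024 MB
Speed: 667 MHz
Status: Ok
"""

if __name__ == "__main__":
    for dimm in suspect_dimms(parse_hpasm_dimms(SAMPLE)):
        print("Cartridge", dimm["Cartridge #"], "Module", dimm["Module #"], dimm["Status"])
```

With the sample record nothing is printed, since its status is `Ok`; on a live system the text would come from the command's standard output instead of `SAMPLE`.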
You obviously REPLACE THE BAD MEMORY.

Memory Device
Array Handle: 0x002B
Error Information Handle: Not Provided
Total Width: 72 bits
Data Width: 64 bits
Size: 4096 MB
Form Factor: DIMM
Set: None
Locator: DIMMA0
Bank Locator: CPU0

Not sure whether it is related to a defective piece of the hardware or totally not related to
Server detail: Red Hat Enterprise Linux ES release 4 (Nahant Update 6)
[[email protected] log]# uname
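The `Memory Device` record above has the shape of `dmidecode` memory output, although the source does not name the tool, so treat that as an assumption. A minimal sketch for pulling the slot labels out of such text follows; it assumes each record starts with a `Memory Device` line and ends at a blank line, and the helper names are illustrative.

```python
def parse_memory_devices(text):
    """Return one dict per 'Memory Device' record in dmidecode-style text."""
    devices, current = [], None
    for line in text.splitlines():
        stripped = line.strip()
        if stripped == "Memory Device":
            current = {}
            devices.append(current)
        elif not stripped:
            current = None  # a blank line ends the current record
        elif current is not None and ":" in stripped:
            key, _, value = stripped.partition(":")
            current[key.strip()] = value.strip()
    return devices


def width_bits(value):
    """'72 bits' -> 72; returns None for non-numeric values such as 'Unknown'."""
    parts = value.split()
    return int(parts[0]) if parts and parts[0].isdigit() else None


def has_ecc_bits(device):
    """True when Total Width exceeds Data Width (72 vs 64 bits on ECC DIMMs)."""
    total = width_bits(device.get("Total Width", ""))
    data = width_bits(device.get("Data Width", ""))
    return total is not None and data is not None and total > data


# Sample record copied from the text above.
SAMPLE = """\
Memory Device
\tArray Handle: 0x002B
\tError Information Handle: Not Provided
\tTotal Width: 72 bits
\tData Width: 64 bits
\tSize: 4096 MB
\tForm Factor: DIMM
\tSet: None
\tLocator: DIMMA0
\tBank Locator: CPU0
"""

if __name__ == "__main__":
    for dev in parse_memory_devices(SAMPLE):
        print(dev["Bank Locator"], dev["Locator"], dev["Size"], "ECC" if has_ecc_bits(dev) else "non-ECC")
```

The `Locator` and `Bank Locator` fields are what tie a record to the silkscreen label on the board, which is usually the missing link between an EDAC row number and a physical slot.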
jpiszcz at lucidpixels, Mar 29, 2010, 6:40 AM, Post #1 of 5
EDAC: Is it possible to calculate which piece of memory is
narayanapalla, 04-15-2010, 02:28 PM, #1: inside /var/log/messages reporting these errors constantly
Hi, The following error messages will continuously
What DIMMs are you using, by the way (exact part number)?
Sometimes other errors can cause what looks like a memory error, but usually a bad memory DIMM is the root cause of the vast majority of such errors. In addition, memtest86+ doesn't find all
The setup of triggers and what it does are covered in this U&L question titled: Writing triggers for mcelog.
You need physical access.
There was an error with your RAM.
Would row 1 be the second DIMM?
No, that would be the FIRST DIMM, on Channel 0.
Each DIMM has 2 ChipSelect Rows (CSROW).
Each csrow covers two channels across, therefore on a 4
Issue
"ECC chipkill x4 error" logged by EDAC in /var/log/messages, similar to the following:
kernel: EDAC k8 MC1: general bus error: participating processor(local node origin), time-out(no timeout) memory transaction type(generic read), mem
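As a rough illustration of the two pieces of reasoning above, the sketch below pulls the driver name and memory-controller index out of an EDAC log line and applies the "2 chip-select rows per DIMM" rule from the reply. The regular expression only covers the `EDAC <driver> MC<n>:` prefix seen in the quoted message, `rows_per_dimm=2` is taken from the reply and will not hold for every module, and both function names are hypothetical.

```python
import re

# Matches the "EDAC k8 MC1:" prefix seen in the quoted log line.
EDAC_PREFIX = re.compile(r"EDAC\s+(?P<driver>\w+)\s+MC(?P<mc>\d+):")


def parse_edac_line(line):
    """Return (driver, mc_index) from an EDAC log line, or None if absent."""
    match = EDAC_PREFIX.search(line)
    if match is None:
        return None
    return match.group("driver"), int(match.group("mc"))


def dimm_index_for_csrow(csrow, rows_per_dimm=2):
    """Map a csrow number to a zero-based DIMM index on one channel.

    Assumption from the reply: each DIMM occupies 2 chip-select rows,
    so rows 0 and 1 belong to the first DIMM, rows 2 and 3 to the second.
    """
    return csrow // rows_per_dimm


if __name__ == "__main__":
    line = ("kernel: EDAC k8 MC1: general bus error: "
            "participating processor(local node origin)")
    print(parse_edac_line(line))      # driver name and controller index
    print(dimm_index_for_csrow(1))    # row 1 falls on the first DIMM (index 0)
```

This agrees with the answer given in the thread: row 1 maps to index 0, the first DIMM. Which physical slot that is, and which CPU owns MC1, still has to be confirmed against the board documentation.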