Wednesday, November 25, 2009

F5 BIGIP: Verify/Restart SNMP Daemon

If you need to check the status of the SNMP daemon on the BIGIP and/or restart it (e.g., because it has stopped responding to SNMP polling), enter the following commands via the CLI:

For BIGIP v4
  1. Check the SNMP daemon status:
    /etc/bigstart/status/S40snmpd status

    The correct output should be:
    Status snmpd: (pid xxxxx) is running
    Status bigsnmpd: (pid yyyyy) is running
    Status rlxsnmpd: is not running


  2. If the result differs from the above (e.g., bigsnmpd is not running), restart the SNMP daemon:
    /etc/bigstart/status/S40snmpd restart
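
If the status output is collected by a script, the decision in step 2 can be automated. The following is a minimal Python sketch that assumes the line format is exactly as in the sample output above; the function names are hypothetical, and how the output is fetched from the device is left out.

```python
import re

# Assumed line format, taken from the sample output above:
#   "Status <name>: (pid <n>) is running"  or  "Status <name>: is not running"
STATUS_LINE = re.compile(
    r"^\s*Status\s+(\S+):\s+(?:\(pid\s+\S+\)\s+)?is\s+(not\s+)?running\s*$"
)

# Daemons expected to be running (rlxsnmpd is expected to be stopped).
REQUIRED = ("snmpd", "bigsnmpd")


def parse_v4_status(output):
    """Return {daemon_name: is_running} for each recognised status line."""
    states = {}
    for line in output.splitlines():
        match = STATUS_LINE.match(line)
        if match:
            # Group 2 captures "not " only when the daemon is stopped.
            states[match.group(1)] = match.group(2) is None
    return states


def needs_restart(output):
    """True if any required daemon is missing from the output or stopped."""
    states = parse_v4_status(output)
    return not all(states.get(name, False) for name in REQUIRED)
```

A daemon that does not appear in the output at all is treated as not running, which errs on the side of restarting.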



For BIGIP v9
  1. Check the current status of the SNMP daemon:
    bigstart status snmpd


  2. Restart the SNMP daemon:
    bigstart restart snmpd


  3. Verify status of the SNMP daemon:
    bigstart status snmpd


  4. Example:
    [root@bigip:Active]~# bigstart status snmpd
    snmpd run (pid 12707) 90 days, 1 start
    [root@bigip:Active]~# bigstart restart snmpd
    [root@bigip:Active]~# bigstart status snmpd
    snmpd run (pid 4822) 6 seconds, 2 starts
    [root@bigip:Active]~#
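
To confirm a restart from a script, the two status lines can be compared: the PID should change. This Python sketch only understands the "run" line format shown in the example above and returns None for anything else; the function names are hypothetical.

```python
import re

# Matches lines such as "snmpd run (pid 12707) 90 days, 1 start".
RUN_LINE = re.compile(
    r"^(?P<service>\S+)\s+run\s+\(pid\s+(?P<pid>\d+)\)\s+"
    r"(?P<uptime>.+?),\s+(?P<starts>\d+)\s+starts?\s*$"
)


def parse_run_line(line):
    """Parse one 'run' status line; return None if the line does not match."""
    match = RUN_LINE.match(line.strip())
    if match is None:
        return None
    return {
        "service": match.group("service"),
        "pid": int(match.group("pid")),
        "uptime": match.group("uptime"),
        "starts": int(match.group("starts")),
    }


def restarted(before_line, after_line):
    """True if both lines show the same service running under a new PID."""
    before = parse_run_line(before_line)
    after = parse_run_line(after_line)
    if before is None or after is None:
        return False
    return before["service"] == after["service"] and before["pid"] != after["pid"]
```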

Wednesday, November 11, 2009

CatOS : SYS-2-MOD_TEMPSENSORFAIL flood from X6148A-GE-45AF

CSCsl37513
http://tools.cisco.com/Support/BugToolKit/search/getBugDetails.do?method=fetchBugDetails&bugId=CSCsl37513

SYS-2-MOD_TEMPSENSORFAIL:Module w/ X6148A-GE-45AF and CatOS

Symptom:
Numerous WS-X6148 linecards generate the following error:
%SYS-2-MOD_TEMPSENSORFAIL:Module # temperature sensors failed, please powercycle the module


Conditions:
No production impact related to this message.

Workaround:
Powercycle module as requested by the error message.
  • set module power down module_number
  • set module power up module_number
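
When many linecards log the message, the module numbers can be pulled out of the syslog and turned into the two commands above. This is a rough Python sketch; it assumes the "#" in the message is a decimal module number, and the function names are hypothetical.

```python
import re

# Assumes the "#" placeholder in the message is a decimal module number.
SENSOR_FAIL = re.compile(
    r"%SYS-2-MOD_TEMPSENSORFAIL:\s*Module\s+(\d+)\s+temperature sensors failed"
)


def affected_modules(log_text):
    """Return the sorted, de-duplicated module numbers found in the log."""
    return sorted({int(number) for number in SENSOR_FAIL.findall(log_text)})


def power_cycle_commands(modules):
    """Build the power down / power up command pair for each module."""
    commands = []
    for module in modules:
        commands.append(f"set module power down {module}")
        commands.append(f"set module power up {module}")
    return commands
```

The commands are only generated as text; review them before pasting them into a session, since power cycling a module interrupts the ports on it.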

Permanent Fix:
Upgrade IOS/CatOS to one of the following versions or later:
8.7(0.22)FW124
8.7(1.62)LAR
8.6(5.7)
8.7(0.22)BUB48
12.2(33.3.13)SXH
12.2(33)SXH4

IOS: IP SLA : SNMP : Poller crashes when the router has been up for more than 497 days

CSCsa57468
rttmon-mib does not return getnext value when queried via snmp


Symptom:
The Concord poller crashes when polling a router that has been configured with IP SLA. In fact, this defect surfaces whenever any NMS (e.g. Concord, IPM, Spectrum, etc.) polls the objects listed in the Conditions section below via SNMP.

Conditions:
The SNMP GETNEXT request is sent to the router for the following OIDs:
  • rttMonJitterStatsCompletions
  • rttMonStatsCaptureCompletions
  • rttMonStatsTotalsInitiations
  • rttMonStatsCaptureEntry (rttMonStatsCaptureCompletion etc.)
  • rttMonStatsCollectEntry
  • rttMonStatsTotalsEntry
  • rttMonJitterStatsEntry
  • rttMonHTTPStatsEntry
Instead of the next index for these OIDs, the router returns the same index again. This happens only when the router has been up and running for longer than 497 days.
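
The 497-day figure matches the point at which a 32-bit counter of hundredths of a second (the unit of SNMP TimeTicks) wraps. The bug text does not name that as the cause, so treat the connection as an assumption. The Python sketch below works out the arithmetic and adds a guard a poller could apply so that a GETNEXT reply that does not advance ends the walk instead of looping. The function names are hypothetical.

```python
WRAP_TICKS = 2 ** 32          # a 32-bit counter wraps after this many ticks
TICKS_PER_SECOND = 100        # SNMP TimeTicks count hundredths of a second
SECONDS_PER_DAY = 86400


def timeticks_wrap_days():
    """Days until a 32-bit hundredths-of-a-second counter wraps."""
    return WRAP_TICKS / TICKS_PER_SECOND / SECONDS_PER_DAY


def oid_tuple(oid):
    """Convert a dotted OID string into a tuple of ints for comparison."""
    return tuple(int(part) for part in oid.strip(".").split("."))


def getnext_advanced(requested_oid, returned_oid):
    """True if the returned OID sorts strictly after the requested one.

    A walk should stop (or raise an error) when this is False, because
    repeating the request would return the same object forever.
    """
    return oid_tuple(returned_oid) > oid_tuple(requested_oid)
```

Comparing tuples of integers rather than strings matters here: as text, "…9" sorts after "…10", which would wrongly flag a normal reply.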

Affected IOS Versions:
  • 12.2(15)T
  • 12.2SXH

Workaround:
The problem occurs only when the CISCO-RTTMON-MIB is polled via SNMP. To avoid it, retrieve the IP SLA statistics from the IOS CLI instead.

Permanent Fix:
Upgrade the IOS version.

Fixed in:
  • 12.3(14.12)M
  • 12.4(1.5)M
  • 12.2(33)SRC
  • 12.2(40)SE
  • 12.2(44)SE
  • 12.3(11)T6
  • 12.3(11)YW
  • 12.3(14)T2
  • 12.4(1.8)T
  • 12.4(1a)M
  • 12.2(33)SXI
  • 12.2(32.8.80)SR
  • 12.2(32.8.11)XID112.9
  • 12.2(33.1.7)SXH
  • 12.2(33)SXH2
  • 12.2(33)SB
  • 12.2(32.8.99a)SR133
  • 12.2(32.8.11)XJC153.1

Sunday, November 8, 2009

IOS: %SSH-3-PRIVATEKEY: Unable to retrieve RSA private key

Symptoms:
The device logs numerous %SSH-3-PRIVATEKEY syslog messages, usually followed by a traceback such as the following:

Nov 7 02:40:49.542 GMT: %SSH-3-PRIVATEKEY: Unable to retrieve RSA private key for
-Process= "SSH Process", ipl= 0, pid= 148
-Traceback= 61D48360 61D44B24 61D462C4 6053BD88 6053BD6C
Nov 8 02:16:22.452 GMT: %SSH-3-PRIVATEKEY: Unable to retrieve RSA private key for
-Process= "SSH Process", ipl= 0, pid= 148
-Traceback= 61D48360 61D44B24 61D462C4 6053BD88 6053BD6C


Explanation:
Often seen after the hostname or domain name of the router has been changed.

Workaround/Fix:

  • Remove the existing RSA key:
    crypto key zeroize rsa
  • Generate an RSA key with the following commands:

    show crypto key mypubkey rsa
    crypto key gen rsa general-keys label label
    ip ssh rsa keypair-name label

    where label = unique label/identifier
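
When the same fix has to be applied to several routers, the command list can be generated per device. This small Python sketch simply substitutes a label into the commands listed above, in the same order; the function name and the label check are hypothetical, and entering the right CLI mode on the router is not handled.

```python
def rekey_commands(label):
    """Return the commands from this post with the given key label filled in."""
    # Hypothetical sanity check: reject empty labels and labels with spaces.
    if not label or any(ch.isspace() for ch in label):
        raise ValueError("label must be non-empty and contain no whitespace")
    return [
        "crypto key zeroize rsa",
        "show crypto key mypubkey rsa",
        f"crypto key gen rsa general-keys label {label}",
        f"ip ssh rsa keypair-name {label}",
    ]
```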

Wednesday, November 4, 2009

Wireless: %DTL-1-ARP_POISON_DETECTED

CSCsm25943 Change label for %DTL-1-ARP_POISON_DETECTED message
http://tools.cisco.com/Support/BugToolKit/search/getBugDetails.do?method=fetchBugDetails&bugId=CSCsm25943

Symptom:

A Wireless LAN Controller may emit a message similar to the following:

DTL-1-ARP_POISON_DETECTED: STA [00:01:02:0e:54:c4, 0.0.0.0] ARP (op 1) received with invalid SPA 192.168.1.152/TPA 192.168.0.206

However, the entry for this message in the Cisco Wireless LAN Controller System Message Guide, 4.2, can be misleading and offers little useful information.

Conditions:

This message does not necessarily imply that any actual "ARP poisoning" (ARP spoofing) is going on. Rather, it is emitted whenever both of the following conditions hold:

- WLAN is configured with DHCP Required
- A client, after associating on that WLAN, transmits an ARP message without first DHCPing

This may be normal behavior - for example, when the client is statically addressed, or when the client is holding a valid DHCP lease from a prior association.

The effect of this condition is that the client will be unable to send or receive any data traffic until it completes DHCP through the WLC.

In more detail, here is how to interpret the example message above:

DTL-1-ARP_POISON_DETECTED: STA [00:01:02:0e:54:c4, 0.0.0.0] ARP (op 1) received with invalid SPA 192.168.1.152/TPA 192.168.0.206

DTL-1-ARP_POISON_DETECTED
- WLC received an ARP packet from a client in DHCP_REQ state

STA [00:01:02:0e:54:c4, 0.0.0.0]
- the client ("STA" - 802.11 wireless station) has a MAC address of 00:01:02:0e:54:c4, and an IP address unknown to the WLC ("0.0.0.0")

ARP (op 1)
- the offending packet received from client was an ARP request (opcode 1)

invalid SPA 192.168.1.152/TPA 192.168.0.206
- the source IP address (SPA - "sender protocol address") of the ARP request was 192.168.1.152
- the target IP address (TPA - "target protocol address") of the ARP request was 192.168.0.206
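
The field-by-field breakdown above can be applied to a whole log with a regular expression. The following Python sketch assumes every message follows the layout of the example exactly; the function name and the returned keys are hypothetical.

```python
import re

# Layout assumed from the example message in this post.
ARP_POISON = re.compile(
    r"DTL-1-ARP_POISON_DETECTED:\s+STA\s+"
    r"\[(?P<mac>[0-9A-Fa-f:]{17}),\s*(?P<ip>[\d.]+)\]\s+"
    r"ARP\s+\(op\s+(?P<op>\d+)\)\s+received with invalid\s+"
    r"SPA\s+(?P<spa>[\d.]+)/TPA\s+(?P<tpa>[\d.]+)"
)

# Standard ARP opcodes.
ARP_OPCODES = {1: "request", 2: "reply"}


def parse_arp_poison(message):
    """Split one message into its fields; return None if it does not match."""
    match = ARP_POISON.search(message)
    if match is None:
        return None
    opcode = int(match.group("op"))
    return {
        "mac": match.group("mac"),
        "known_ip": match.group("ip"),   # 0.0.0.0 means unknown to the WLC
        "opcode": opcode,
        "op_name": ARP_OPCODES.get(opcode, "other"),
        "spa": match.group("spa"),
        "tpa": match.group("tpa"),
    }
```

Grouping the parsed results by "mac" shows quickly whether the messages come from a handful of statically addressed clients or from the whole WLAN.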

Workaround:

  1. Decide whether or not you want to force your wireless clients to DHCP first, after associating, before they can send IP packets.


  2. If no, then unconfigure DHCP required, and you won't get this problem.


  3. If yes, then configure all clients to use DHCP.


  4. If the client is configured for DHCP, but still sometimes sends IP packets after associating without re-DHCPing, then:

    • See whether the client eventually does re-DHCP and, if so, whether the outage before it re-DHCPs is acceptable. If it is, then you can just ignore this message.


    • If the client never does re-DHCP after associating, then it will never be able to pass L3 traffic. So in that case, either figure out how to change the client's behavior so that it always does re-DHCP after associating, or else just accept that this client won't work in this application, or else reconsider your decision to use "DHCP required".



Further Problem Description:

If the source IP address (SPA) of the ARP is an APIPA address (i.e. one in 169.254.0.0/16), this may indicate that the STA is attempting but failing to acquire an address via DHCP, in which case you may want to verify that your DHCP implementation works.
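
The APIPA test is easy to script with Python's standard ipaddress module. A minimal sketch, where the function name is hypothetical and the input is assumed to be the SPA taken from the message:

```python
import ipaddress

# APIPA (IPv4 link-local) range named in this post.
APIPA_NETWORK = ipaddress.ip_network("169.254.0.0/16")


def is_apipa(address):
    """True if the dotted-quad address lies inside 169.254.0.0/16."""
    return ipaddress.ip_address(address) in APIPA_NETWORK
```

For example, is_apipa("169.254.10.5") is True, while the SPA from the sample message, 192.168.1.152, gives False.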

1st Found-In:
4.2(61.0)

Fixed-In:
7.0(63.0)

Wireless: %APF-3-USER_DEL_FAILED

Event : %APF-3-USER_DEL_FAILED: Unable to delete username unknown for mobile mac-address

Explanation: This error can mean slightly different things depending on EAP method. Basically it is a side effect of an EAP method with identity protection.

With such a method, EAP authentication is done in two phases. The first phase of authentication uses a generic anonymous external identity in order to establish the tunnel. In phase 2, client authentication is done in the established tunnel. The client sends the original username and password to authenticate and establish a client authorization policy. Because this authentication method hides the original username during the first phase, the controller has no way to add the correct username to the authenticated user list, so it uses the anonymous username instead. The end result is this error.

Further details on the related bug below:



CSCsz51403 %APF-1-USER_DEL_FAILED: apf_ms.c:5055 flooding msglogs.
http://tools.cisco.com/Support/BugToolKit/search/getBugDetails.do?method=fetchBugDetails&bugId=CSCsz51403

Symptom:
The "%APF-1-USER_DEL_FAILED: apf_ms.c:5055" message floods msglogs

Conditions:
1. Multiple clients connect to the controller with the same user name, or
2. AAA server returns a user name that is different to what is registered by the client.

Workaround:
None, but the message does not affect any controller feature.

1st Found-In:
  • 5.2(178.12)
  • 5.2(178.13)

Fixed-In:
  • 6.0(176.0)
  • 5.2(186.0)
  • 6.1(34.0)
  • 6.0(182.0)
  • 4.2(205.1)
  • 5.2(193.0)
  • 4.2(207.0)