Fault¶
Introduction¶
The MIB2FCOM application in the SDK generates an initial FCOM definition file from the MIB files you provide. Make sure that you have all of the required MIB dependencies before you proceed. The FCOM2Rules application generates a foundation rules file from an FCOM file. The FCOM2Test application generates a set of synthetic test traps for testing the workflow of the curated FCOM. The following are the general steps to create an FCOM definition file, generate a foundation rules file from it, and test the result.
Convert the MIB File to FCOM¶
Run MIB2FCOM to convert the MIB file into an FCOM file. The application needs only the input file and the name of the desired output file. Alternatively, you can pass in the name of a directory containing MIBs; MIB2FCOM then runs through all the MIBs in the directory and produces FCOM files for those that contain traps.
Warning
Make sure you have all dependency MIBs at this stage as any missing MIBs will likely result in a conversion failure.
Usage Example (single MIB file):
$A1BASEDIR/bin/sdk/MIB2FCOM --in=ZXDUMIB.mib --out=ZXDUMIB-FCOM.json --use_parent_mibs
Usage Example (directory containing MIBs):
In the following example, the current working directory is a sub-directory of $A1BASEDIR/distrib/mibs which contains the NET-SNMP collection of MIBs.
$A1BASEDIR/bin/sdk/MIB2FCOM --in=. --out=NET-SNMP-FCOM.json --use_parent_mibs
Note
If there are many dependencies and you want to keep your working directories clean, you can put your temporary work files into a subdirectory of $A1BASEDIR/distrib/mibs/ and use the "--use_parent_mibs" command line option when running MIB2FCOM from that subdirectory.
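As a rough pre-flight check for the dependency warning above, the following Python sketch lists the modules named in IMPORTS clauses that none of the supplied MIB texts define. It is not part of the SDK: the function names and the sample MIB text are hypothetical, and comment handling is deliberately simplified.

```python
import re

def _strip_comments(mib_text):
    # Simplification: treat everything after "--" on a line as a comment.
    return re.sub(r"--.*", "", mib_text)

def defined_module(mib_text):
    """Return the module name from '<NAME> DEFINITIONS ::= BEGIN', or None."""
    match = re.search(r"([A-Za-z][A-Za-z0-9-]*)\s+DEFINITIONS\s*::=\s*BEGIN",
                      _strip_comments(mib_text))
    return match.group(1) if match else None

def imported_modules(mib_text):
    """Return the set of module names that follow FROM in the IMPORTS clause."""
    match = re.search(r"\bIMPORTS\b(.*?);", _strip_comments(mib_text), re.DOTALL)
    if not match:
        return set()
    return set(re.findall(r"\bFROM\s+([A-Za-z][A-Za-z0-9-]*)", match.group(1)))

def missing_dependencies(mib_texts):
    """Modules that are imported somewhere but defined by none of the texts."""
    available = {defined_module(text) for text in mib_texts}
    needed = set()
    for text in mib_texts:
        needed |= imported_modules(text)
    return needed - available

# Hypothetical MIB text, used only for illustration.
EXAMPLE_MIB = """
EXAMPLE-MIB DEFINITIONS ::= BEGIN
IMPORTS
    MODULE-IDENTITY, NOTIFICATION-TYPE FROM SNMPv2-SMI  -- core SMI
    DisplayString FROM SNMPv2-TC;
END
"""

print(sorted(missing_dependencies([EXAMPLE_MIB])))
```

To use it on real files, read every MIB that the converter will be able to see into the list; any name that is still reported is a candidate missing dependency.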
Curate the FCOM file¶
Open the FCOM file that you created in a text editor and start the curation process. See the FCOM Curation section for additional information.
Note
We highly recommend keeping the MIB files you are working on open in a MIB browser during the curation process, so that you can use them as a contextual reference.
Generate the Foundation Rules File¶
Run FCOM2Rules to convert the FCOM file into foundation rules. The application prints to the standard output so you can visually verify the code before writing it to the file.
Usage example:
$A1BASEDIR/bin/sdk/FCOM2Rules ZXDUMIB-FCOM.json >> ZXDUMIB.foundationrules
Generate the Test File¶
Run FCOM2Test to generate the FCOM test file. The application prints to the standard output, which lets you verify the tests before writing to a file. The application will generate tests based on the "test/tests" fields from the FCOM.
Usage example:
$A1BASEDIR/bin/sdk/FCOM2Test ZXDUMIB-FCOM.json >> ZXDUMIB.test.sh
Note
Before running the tests, make sure that the following environment variables are exported:
export BASEDIR=<Devices repository directory or directory containing MIBs>
export HOSTFQDN=<FQDN or IP of the target Assure1 server used for testing>
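A minimal Python sketch of a check for the two variables named above, for example as a guard in a wrapper script. The function name is hypothetical and the check only confirms that the variables are set, not that their values are valid.

```python
import os

REQUIRED_VARS = ("BASEDIR", "HOSTFQDN")

def missing_variables(environ):
    """Return the required variable names that are unset or empty."""
    return [name for name in REQUIRED_VARS if not environ.get(name)]

missing = missing_variables(os.environ)
if missing:
    print("Export these variables before running the tests:", ", ".join(missing))
else:
    print("Required variables are set.")
```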
Upload the Foundation Rules File¶
-
Upload the generated foundation rules file through the Rules UI to the following location:
Core Rules (core) -> Default read-write branch (default) -> collection -> event -> trap -> custom
-
Restart the Event Trap Aggregator service.
-
Run the generated <Device>.test.sh file and verify the events that appear in the Event List. If you find any issues during this review, go back through the FCOM curation process to correct them, then regenerate and retest. Repeat until the results are correct.
FCOM Curation¶
Fields that you can modify are marked in green. Other fields that are not marked should not be modified during the curation process.
-
"@objectName": "JUNIPER-VPN-MIB::jnxVpnIfUp" – This field contains the name of the trap, in the format MIB-FILE::trapName.
-
"certification": "STANDARD" – This field is used by the product to identify the certification type of the trap.
-
"description": [] – This field contains the contextual description of the trap extracted from the MIB file. Pay attention to this field as the description can give helpful information about the nature of the trap, what the notification does, and potentially how it should be handled. This information should be used during the curation process to craft an understandable human description of the trap summary.
-
"domain": "FAULT" – This field is used by the product to identify the telemetry domain.
-
"event": {} – The event object is a container for modifiable fields that are populated when the event is created.
-
"EventCategory": 1 – This field indicates the EventCategory for the event in Assure1. Possible values are:
-
3 which means "Discrete" – Use this value for informational events, that is, when the incoming event is neither a problem nor a resolution.
-
2 which means "Problem" – Use this value if the incoming event indicates a problem that can be "resolved" by another trap arriving with different values. Use it together with a Severity greater than 0. See the Severity description below.
-
1 which means "Resolution" – Use this value if the incoming event indicates the resolution of a previous problem that was raised by another trap arriving with different values. Use it together with a Severity of 0. See the Severity description below.
For this field, it is very common to use an eval statement with ternary operators to set a different EventCategory based on the incoming trap values. For example, if the $v1 varbind has three possible enumerations (1: Major, 2: Clear, 3: Info), you can use an eval statement like this:
$v1==1 ? 2 : $v1==2 ? 1 : 3
In that example:
-
If $v1 has a value of 1, we set EventCategory to 2 (Problem).
-
If $v1 has a value of 2, we set EventCategory to 1 (Resolution).
-
In any other case, we set EventCategory to 3 (Discrete).
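The branching of that eval statement can be checked outside the product with a small Python sketch. Python is used here only for illustration (the eval itself uses the ternary syntax shown above), and the constant and function names are hypothetical.

```python
# EventCategory values as listed above.
DISCRETE, PROBLEM, RESOLUTION = 3, 2, 1

def event_category(v1):
    # Same branching as: $v1==1 ? 2 : $v1==2 ? 1 : 3
    return PROBLEM if v1 == 1 else RESOLUTION if v1 == 2 else DISCRETE

for value in (1, 2, 3):
    print(value, "->", event_category(value))
```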
-
"EventType": "jnxVpnIf" (original: "jnxVpnIfUp") – This field is automatically populated with the trap object name, so upon initial generation it is unique among all traps. EventType is part of the Correlation Set, which consists of:
[Device]+[SubNode]+[EventType]+[EventCategory]
For correlation to work, the first three fields above must be identical across the correlated traps. For curation purposes, only SubNode (see the description below) and EventType are of interest. Sometimes correlation does not happen within a single trap (where one varbind takes one of a few possible values such as clear or major; see the EventCategory notes), but across several different traps. By reading the full contextual descriptions of the traps in the MIBs, you should be able to identify separate traps that are intended to correlate with each other.
Consider two separate traps, jnxVpnIfUp and jnxVpnIfDown. From the MIB descriptions of the traps, we know that the first is sent when the VPN interface is up and the second when the same interface is down. To make the two traps correlate, the SubNode (see further notes) and EventType of both traps must be identical. In this example, you would remove the suffixes "Up" and "Down" from the EventType in both traps, leaving only jnxVpnIf. Then, when one trap arrives with Severity > 0, it raises the event, and when the second trap arrives, it correlates with that event and clears it.
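The effect of curating EventType can be sketched in Python as a comparison of correlation keys. This is only an illustration of the rule stated above (Device, SubNode and EventType must match); the device name, SubNode value and function name are hypothetical, and the product's actual correlation logic is not shown here.

```python
def correlation_key(event):
    # Device, SubNode and EventType must match for two events to correlate.
    return (event["Device"], event["SubNode"], event["EventType"])

# Hypothetical events: same device and interface, EventType curated to "jnxVpnIf".
down = {"Device": "router1", "SubNode": "EXAMPLE-9999", "EventType": "jnxVpnIf",
        "EventCategory": 2, "Severity": 4}
up = {"Device": "router1", "SubNode": "EXAMPLE-9999", "EventType": "jnxVpnIf",
      "EventCategory": 1, "Severity": 0}

# The same "up" event with the generated, uncurated EventType.
uncurated_up = dict(up, EventType="jnxVpnIfUp")

print(correlation_key(down) == correlation_key(up))            # curated pair matches
print(correlation_key(down) == correlation_key(uncurated_up))  # uncurated does not
```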
-
"ExpireTime": 3600 – This field indicates the default expiry time, in seconds, for the event if no action is taken by a user. It is common to set it to 1 hour (3600 seconds) for correlated Problem/Resolution event pairs, and to 24 hours (86,400 seconds) for any Discrete/Informational event.
-
"Severity": 0 – This field determines the severity of the event inserted into Assure1. Refer to the trap description and possible enums to determine the actual problem severity. Possible values are:
-
5 which means "Critical" – Use this severity for any critical, system-breaking problems. Generally, EventCategory should be set to 2.
-
4 which means "Major" – Use this severity for any major problems that can affect production systems. Generally, EventCategory should be set to 2.
-
3 which means "Minor" – Use this severity for any minor and non-critical problems. Generally, EventCategory should be set to 2.
-
2 which means "Info" – Use this severity for any informational events that do not affect system operation in any significant way, e.g. "User logged in". Generally, EventCategory should be set to 3.
-
1 which means "Unknown" – Do not use this severity during trap curation unless it is absolutely necessary, the trap definition itself indicates it, or the nature of the trap requires additional custom work from the customer's operations team, e.g. the incoming event is a table of custom metrics preset on the EMS system by the customer. In that case, the customer would have to create their own flow to process this custom trap. Generally, EventCategory should be set to 3.
-
0 which means "Normal" – Use this severity for any clear/OK events. Generally, EventCategory should be set to 1.
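The "generally" pairings between Severity and EventCategory in the list above, together with the common ExpireTime values, can be summarized in a short Python sketch. It encodes only the rules of thumb stated in this section, not a product requirement, and the function names are hypothetical.

```python
SEVERITY_NAMES = {5: "Critical", 4: "Major", 3: "Minor",
                  2: "Info", 1: "Unknown", 0: "Normal"}

def usual_event_category(severity):
    """EventCategory that generally goes with a given Severity."""
    if severity not in SEVERITY_NAMES:
        raise ValueError(f"unknown severity: {severity}")
    if severity == 0:
        return 1  # Resolution
    if severity in (1, 2):
        return 3  # Discrete
    return 2      # Problem

def usual_expire_time(event_category):
    """Common ExpireTime in seconds: 24 hours for Discrete, otherwise 1 hour."""
    return 86400 if event_category == 3 else 3600

for severity in sorted(SEVERITY_NAMES, reverse=True):
    category = usual_event_category(severity)
    print(severity, SEVERITY_NAMES[severity], category, usual_expire_time(category))
```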
-
"SubNode": "$v2-$v3" – This field is used as a unique identifier of the SubNode for the trap. For example, in many cases we are dealing with shelved network equipment. This equipment is automatically identified by the Device field, but to identify the interfaces present on that shelf, you might need to use certain varbinds of the trap, if present. If $v2 were a shelf number and $v3 an interface number, they would be ideal candidates for SubNode identification. The SubNode is needed to distinguish between the various interfaces on the same device so that correlation works.
Warning
In this field, dynamic variables like Date/Time should be avoided, and variables that indicate unique interface naming should be used. Try to avoid spaces and special characters, keeping the string as simple as possible. If the specified device does not have any unique interfaces, it is ok to leave the default "device" string in that field.
-
"Summary": "VPN $v2, Type: $v1 is up. [IfIndex: $v3]" – This field represents the Summary text that will appear in Assure1 Events. Use common sense when creating this string: make it as indicative and human-readable as possible, and leave out any unnecessary information. (All of the varbind values are populated in the event's Details field for reference.) A good basis for the event summary is the textual description of the trap. If correlated events (like problem and resolution) need different Summary texts, you can use "eval" statements to branch based on varbind values.
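To preview how the SubNode and Summary templates above expand, the following Python sketch substitutes the varbind values from the generated test trap (type 1, name EXAMPLE, index 9999). It uses Python's string.Template, which happens to share the $name placeholder style; this only mimics the substitution and is not how the product evaluates the fields.

```python
from string import Template

# Varbind values taken from the synthetic test trap for jnxVpnIfUp.
varbinds = {"v1": "1", "v2": "EXAMPLE", "v3": "9999"}

sub_node = Template("$v2-$v3").substitute(varbinds)
summary = Template("VPN $v2, Type: $v1 is up. [IfIndex: $v3]").substitute(varbinds)

print(sub_node)  # EXAMPLE-9999
print(summary)   # VPN EXAMPLE, Type: 1 is up. [IfIndex: 9999]
```

Previewing the strings this way makes it easy to spot spaces or special characters in a SubNode before the rules are generated.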
-
"metaData": {} – This object contains metadata for the FCOM file.
-
"certified": false – This field is used by Federos to certify QA-reviewed FCOM files for production use.
-
"method": "trap" – This field is automatically populated. Internal system use.
-
"test": "$SNMPTRAPCMD JUNIPER-VPN-MIB::jnxVpnIfUp JUNIPER-VPN-MIB::jnxVpnIfVpnType i 1 JUNIPER-VPN-MIB::jnxVpnIfVpnName s EXAMPLE JUNIPER-VPN-MIB::jnxVpnIfIndex u 9999" – This field is automatically populated during FCOM generation and is used to create a "test trap" that can be sent in via the command line. Bear in mind that it is purely synthetic and is no substitute for real-life data.
-
"tests" (optional) – Instead of a single test trap, you can use "tests" as an array of strings. This lets the operator include more correlation tests within the set. During test generation, the additional tests are automatically grouped and commented into sets within the .sh file.
-
"trap" : {} – This object contains detailed information about trap structure and its varbinds.
-
"name": "JUNIPER-VPN-MIB::jnxVpnIfUp" – Name of the main trap, format MIB::trapName
-
"oid": "1.3.6.1.4.1.2636.3.26.0.1" – OID of the main trap.
-
"variables": [] – This field contains a list of all the trap varbinds.
-
"description": [] – This field contains the textual description of the varbind.
-
"enums": {} – If present in the MIB, this field contains the possible enumerations of the varbind values, which are automatically used to resolve the values in the FCOM process.
-
"name": "JUNIPER-VPN-MIB::jnxVpnIfVpnType" – Name of the varbind, format: MIB-FILE::varbindName
-
"oid": "1.3.6.1.4.1.2636.3.26.1.3.1.1" – OID of the varbind.
-
"valueType": "INTEGER" – The type of the value the varbind carries.
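Putting the fields above together, a single curated entry might look roughly like the following, written here as a Python dictionary and printed as JSON. This is a sketch: the nesting is inferred from the field list in this section, only the one varbind described above is shown, and the optional "tests" array is omitted. Compare it against a file generated by MIB2FCOM rather than treating it as the authoritative layout.

```python
import json

fcom_entry = {
    "@objectName": "JUNIPER-VPN-MIB::jnxVpnIfUp",
    "certification": "STANDARD",
    "description": [],
    "domain": "FAULT",
    "event": {
        "EventCategory": 1,
        "EventType": "jnxVpnIf",  # curated from "jnxVpnIfUp"
        "ExpireTime": 3600,
        "Severity": 0,
        "SubNode": "$v2-$v3",
        "Summary": "VPN $v2, Type: $v1 is up. [IfIndex: $v3]",
    },
    "metaData": {"certified": False},
    "method": "trap",
    "test": ("$SNMPTRAPCMD JUNIPER-VPN-MIB::jnxVpnIfUp "
             "JUNIPER-VPN-MIB::jnxVpnIfVpnType i 1 "
             "JUNIPER-VPN-MIB::jnxVpnIfVpnName s EXAMPLE "
             "JUNIPER-VPN-MIB::jnxVpnIfIndex u 9999"),
    "trap": {
        "name": "JUNIPER-VPN-MIB::jnxVpnIfUp",
        "oid": "1.3.6.1.4.1.2636.3.26.0.1",
        "variables": [
            {
                "description": [],
                "enums": {},
                "name": "JUNIPER-VPN-MIB::jnxVpnIfVpnType",
                "oid": "1.3.6.1.4.1.2636.3.26.1.3.1.1",
                "valueType": "INTEGER",
            },
        ],
    },
}

print(json.dumps(fcom_entry, indent=2))
```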
Preprocessors¶
-
"preprocessors" : [] – This object stores any preprocessors used in the FCOM runtime. As the name indicates, all the preprocessors are run first during runtime to "pre-process" any required variables. The currently implemented preprocessors are:
-
"regex" : { - This preprocessor runs a regex match against any available variable and stores the result in a variable named after the capture group, for use at runtime. Its fields are:
-
"value" : This field indicates the variable you would like to run the regex against, for example $oid1.
-
"pattern" : This field stores the regex pattern you want to run against the value field. To store a capture group for later use as a variable, use the following named-group syntax, where ... stands for the sub-pattern to capture:
(?P<myCaptureVariable>...)
If the regex match is successful, the variable named within the angle brackets, in our example myCaptureVariable, is available to the FCOM during runtime and can be accessed with Perl variable syntax: $myCaptureVariable. If the match is unsuccessful, the variable is undefined.
-
"flags" : - This is an optional field where you can specify the regex flags to use during the match. Possible values are g,I,m,s,u. If the field is not specified, the regex defaults to the global g flag.
}
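The behavior of the regex preprocessor described above can be approximated in Python, whose re module accepts the same (?P<name>...) named-group syntax. This is an illustrative sketch only: the function name, the OID suffix and the pattern are hypothetical, and flags are not modeled.

```python
import re

def run_regex_preprocessor(value, pattern, variables):
    """Copy the named capture groups of a successful match into `variables`."""
    match = re.search(pattern, value)
    if match:
        variables.update(match.groupdict())
    return variables

# Hypothetical varbind OID whose last component is an instance index.
variables = {"oid1": "1.3.6.1.4.1.2636.3.26.1.3.1.1.9999"}
run_regex_preprocessor(variables["oid1"], r"\.(?P<myCaptureVariable>\d+)$", variables)

print(variables.get("myCaptureVariable"))  # 9999
```

When the pattern does not match, nothing is added, which corresponds to the variable being undefined at runtime.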
-
"lookup" : { - This preprocessor is used to look up values in a custom hash lookup. Bear in mind that, for the moment, the lookup table (a Perl hash) has to be added manually by the user to the FCOM rules files (it is recommended to put a *.lookup file in the custom directory and load it in base.load). An automatic lookup loading process is planned for the future.
-
"source" : - This field indicates the name of the hash lookup table to check the key against.
-
"key" : - This field indicates the variable whose value is looked up in the table.
-
"target" : - This field indicates the name of the variable that stores the value of a successful lookup. The variable is available for use during runtime.
}
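The three lookup fields can be pictured with a Python sketch in which a dictionary stands in for the Perl hash. The table name, its contents and the function name are hypothetical, and leaving the target unset on a failed lookup is an assumption, since the behavior in that case is not described above.

```python
def run_lookup_preprocessor(tables, source, key, target, variables):
    """Look up the value of variable `key` in table `source`; store a hit in `target`."""
    table = tables.get(source, {})
    lookup_value = variables.get(key)
    if lookup_value in table:
        variables[target] = table[lookup_value]
    # Assumption: on a miss, the target variable is simply left unset.
    return variables

# Hypothetical lookup table and varbind value.
tables = {"vpnTypeNames": {"1": "type-a", "2": "type-b"}}
variables = {"v1": "1"}

run_lookup_preprocessor(tables, "vpnTypeNames", "v1", "vpnTypeName", variables)
print(variables["vpnTypeName"])  # type-a
```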
-
"conversion" : { - This preprocessor is used to convert between different value types (integers, strings, etc.).
-
"source" : - This field indicates the source variable you want to convert.
-
"target" : - This field indicates the target variable in which you want to store conversion result.
-
"type" : - The type of conversion to run. Possible values are:
-
"StringToInt" – This conversion type converts a string into the sum of the integer ASCII values of its characters.
-
"CharToInt" – This conversion type converts a single character into its ASCII value. If the source is a longer string, only its first character is converted.
}
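The two conversion types described above amount to the following arithmetic, sketched in Python with hypothetical function names. The sketch assumes a non-empty source string, because the behavior for empty input is not described.

```python
def string_to_int(source):
    """StringToInt: sum of the ASCII values of all characters."""
    return sum(ord(char) for char in source)

def char_to_int(source):
    """CharToInt: ASCII value of the first character only (source must be non-empty)."""
    return ord(source[0])

print(string_to_int("AB"))  # 65 + 66 = 131
print(char_to_int("AB"))    # 65
```

Note that StringToInt is not unique per string: "AB" and "BA" both convert to 131.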
FCOM Runtime Variables¶
The following variables are exposed by Trapd for FCOM during runtime, and are available for the user to use in the curation process:
-
$v1 .. $v20 – These variables store raw varbind values for the trap. Varbind 1 would be $v1 and so on.
-
$oid1 .. $oid20 – These variables contain the full OID of the specific varbind for the trap. The OID of varbind 1 would be $oid1 and so on.
-
$ip – This variable stores the source IP of the trap.
-
$trapoid – This variable contains the full OID of the trap.
-
$node – This variable stores the source node name of the incoming trap.
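How the variables listed above relate to an incoming trap can be sketched in Python. The function name, node name, IP address and varbind OID suffixes are hypothetical, and stopping at 20 varbinds is an assumption based on the $v1 .. $v20 range; this does not describe how Trapd is implemented.

```python
def runtime_variables(ip, node, trapoid, varbinds):
    """Build the variable names described above from (oid, value) pairs."""
    variables = {"ip": ip, "node": node, "trapoid": trapoid}
    # Assumption: only the first 20 varbinds are exposed, as $v1..$v20 / $oid1..$oid20.
    for index, (oid, value) in enumerate(varbinds[:20], start=1):
        variables[f"v{index}"] = value
        variables[f"oid{index}"] = oid
    return variables

# Hypothetical trap data; the trap OID is the jnxVpnIfUp OID shown earlier.
example = runtime_variables(
    ip="192.0.2.10",
    node="router1.example.com",
    trapoid="1.3.6.1.4.1.2636.3.26.0.1",
    varbinds=[
        ("1.3.6.1.4.1.2636.3.26.1.3.1.1.9999", "1"),
        ("1.3.6.1.4.1.2636.3.26.1.3.1.2.9999", "EXAMPLE"),
    ],
)

print(example["v2"], example["oid1"])
```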