NURI is a set of tools that perform various operations in the context of complex networks and/or devices, helping the manager keep the "global environment" almost clean.
This release includes a major redesign of the data format (XML), the languages (C) and the algorithms, plus minor revisions of the output, the operator interface and other aspects.
Environment and installation
NURI is native to FreeBSD, so the easiest way to get a running system is to install it on that OS, but it also runs quite well on Linux.
There are a few requirements: a C compiler, Apache, PHP, expect and dot; install each of them in the usual way, if needed.
Expanding nuri.tgz creates a directory tree under the current path; it can live anywhere you like and be soft-linked to the "www" dir. The structure is very simple: a main dir with some files and a set of subdirs:
cfg holds temporary device runconfs.
log holds flow logs.
src keeps the C source code to be compiled; it can be deleted once the .bin executables have been created.
sex gathers the Expect scripts able to query devices on demand; the scripts need to be flagged as executable by everyone.
tmp collects temporary/intermediate files (XML, DOT, PNG) and needs to be writable by anyone.
Some of those dirs must be cleaned manually from time to time so as not to waste disk space; this has been left to a human because the life cycle of each file is unpredictable.
If you are NOT using the predefined path "/usr/local/www/apache22/data/nuri41/" you need to change the "include" statements at the very beginning of each PHP module to reflect your path.
Virtualization and complex networks management
The "true" change in this release is abstraction, which is reflected in the internal data representation.
Cisco, CheckPoint, Juniper and IPFW are all different, not only in price/performance but also in the way they describe instructions for a device, in keywords and in everything except the need for manageability; so the only functional solution is to abstract: virtual devices are described in terms of an XML router/firewall structure, and virtual syslogs are described as an XML syslog structure.
We need a way to populate such XML files starting from plaintext files containing the output of "sh run" or the equivalent command for each device; a specialized parser is therefore supplied for each device type.
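As a rough illustration of what such a parser does, here is a minimal Python sketch. It is not NURI's parser: the simplified "access-list" syntax, the XML tag names (device, acl, proto, src, ...) and the function name runconf_to_xml are all assumptions made for this example.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical, simplified Cisco-like syntax (an assumption of this sketch):
#   access-list <id> <permit|deny> <proto> <src> <src-wildcard> <dst> <dst-wildcard>
ACL_LINE = re.compile(
    r"^access-list\s+(?P<id>\d+)\s+(?P<action>permit|deny)\s+(?P<proto>\S+)\s+"
    r"(?P<src>\S+)\s+(?P<src_wild>\S+)\s+(?P<dst>\S+)\s+(?P<dst_wild>\S+)$"
)


def runconf_to_xml(text):
    """Turn the ACL lines of a plaintext runconf into a (hypothetical) XML tree."""
    device = ET.Element("device")
    for line in text.splitlines():
        match = ACL_LINE.match(line.strip())
        if match is None:
            continue  # lines this sketch does not understand are skipped
        acl = ET.SubElement(device, "acl", id=match["id"], action=match["action"])
        for field in ("proto", "src", "src_wild", "dst", "dst_wild"):
            ET.SubElement(acl, field).text = match[field]
    return device


sample = """hostname fw1
access-list 101 permit tcp 10.0.0.0 0.0.0.255 10.1.1.0 0.0.0.255
access-list 101 deny ip 0.0.0.0 255.255.255.255 0.0.0.0 255.255.255.255
"""
tree = runconf_to_xml(sample)
print(ET.tostring(tree, encoding="unicode"))
```

A real parser has to cope with many more statement forms (host, any, port operators, named ACLs); the point here is only the plaintext-to-XML direction.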
Moreover, expect scripts can be used as on-demand raw-data grabbers; they just need to be created once per device.
For huge pools of devices it may be convenient to automate the creation of expect scripts; Radius, LDAP and other authentication servers can be useful starting points for that kind of information, usually double-checked against DNS entries.
Because of the variability of data sources and its rare use, that function has been kept unreachable from the main menu.
Abstraction from vendor details has a price: the virtual device carries less info than the real one; but it allows a great degree of computability and the intrinsic equivalence of devices, as shown in the NURI logic.
A huge set of questions arises in the minds of net managers, driven from time to time by one contextual matter or another.
The ability to answer those questions is central when moving an "unknown-state network" toward a "certified-error-free network"; but the questions span widely!
There is one and only one point common to all such questions: the answer exists and is implicit in some set of data, but defining "what is pertinent" is a curse, because it changes almost every time.
NURI solves it all in one: the data are loaded into (an equivalent of) a 17-dimensional space and from that moment can be mixed and measured in terms of overlaps between N-dimensional figures, with 0 <= N <= 17.
So it becomes possible to answer, with measured data and quantities, questions like:
Given runconfs from the same device at different times: what differences exist between them?
Given runconfs from different devices: which flows can [not] traverse all of them?
Given runconfs: what is the minimal set of ACLs that includes all subsets?
Given a model and runconf(s): what is the discrepancy between the "skeleton" and the "active rules"?
Given a model and runconf(s): which ACLs belong to an arbitrarily composed subset of rules?
Given model(s) and runconf(s): what will happen if "predefined" change(s) are introduced?
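The first question of the list can be pictured with plain set arithmetic. The Python sketch below is only an illustration under stated assumptions: ACLs are represented as tuples of already-normalized fields, and the function name diff_runconfs is invented for this example; NURI's own comparison works on overlaps in its 17-dimensional space, not on exact equality.

```python
def diff_runconfs(old_acls, new_acls):
    """Compare two ACL sets taken from the same device at different times."""
    old, new = set(old_acls), set(new_acls)
    return {
        "removed": sorted(old - new),
        "added": sorted(new - old),
        "unchanged": sorted(old & new),
    }


# Hypothetical ACL tuples: (action, protocol, source, destination, dst port)
monday = [
    ("permit", "tcp", "10.0.0.0/24", "10.9.9.9/32", 80),
    ("permit", "udp", "10.0.0.0/24", "10.9.9.9/32", 161),
]
friday = [
    ("permit", "tcp", "10.0.0.0/24", "10.9.9.9/32", 80),
    ("permit", "tcp", "10.0.0.0/24", "10.9.9.9/32", 443),
]
changes = diff_runconfs(monday, friday)
print(changes["added"], changes["removed"])
```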
A special note on the discrepancy function: it is somewhat difficult to imagine a 17-dimensional intersection between objects ranging from 0 to 17 dimensions; so 13 of them are kept hidden, namely those for which the overlap can only be 0 or 1.
The remaining 4 can be manipulated as we like, but 4-dim objects are still awkward for monitors & humans, so it seems reasonable to split one 4-dim object into two 2-dim projections.
But the most compressed knowledge can be expressed as the overlap ratio between a "reference figure" (a model rule) and an "actual figure" (an ACL) placed in a 0 to 4-dim space (plus 0 to 13 hidden dimensions).
In other words, the intersection between a K-dim object and a (K+J)-dim object can't be more than K-dim wide; we can strip J dimensions from the more complex object, measure the difference as a K-dim extension/overlap ratio and express it as a pure number: the fit percentage.
We can observe these main cases:
ACL fits 100% of the model: it is congruent and approved.
ACL fits 0% of the model: still congruent, but unapproved.
ACL fits <1% of the model: there is a random and mostly harmless overlap, but the ACL is unapproved.
ACL fits >40% of the model: there is strong evidence of a problem; the ACL is similar to the model, but differs in at least one property! Interpret it as a strong suggestion to check the device configuration.
An ACL fitting the model at a ratio between 1 and 40% is a "doubt case"; as far as your time permits, check what is happening in that "grey zone".
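The fit percentage and the cases above can be sketched in a few lines of Python. This is a guess at the idea, not NURI's formula: it assumes that both figures have already been reduced to the same K dimensions, that each dimension is an inclusive integer range, and that "fit" is the volume of the intersection divided by the volume of the union; the function names are invented for the example.

```python
def volume(box):
    """Number of points in a box given as a list of inclusive (low, high) ranges."""
    size = 1
    for low, high in box:
        size *= high - low + 1
    return size


def intersection(a, b):
    """Overlap of two boxes with the same dimensions, or None if they are disjoint."""
    common = []
    for (a_low, a_high), (b_low, b_high) in zip(a, b):
        low, high = max(a_low, b_low), min(a_high, b_high)
        if low > high:
            return None
        common.append((low, high))
    return common


def fit_percentage(model, acl):
    """Assumed definition: 100 * |model AND acl| / |model OR acl|."""
    common = intersection(model, acl)
    if common is None:
        return 0.0
    shared = volume(common)
    return 100.0 * shared / (volume(model) + volume(acl) - shared)


def classify(fit):
    """Map a fit percentage onto the cases listed in the text."""
    if fit == 100:
        return "congruent and approved"
    if fit == 0:
        return "no overlap, unapproved"
    if fit < 1:
        return "random overlap, unapproved"
    if fit > 40:
        return "probable problem, check the device"
    return "grey zone"


# Two dimensions only: an address range (as integers) and a port range.
model = [(0, 255), (80, 80)]
half = [(0, 127), (80, 80)]       # same port, half of the addresses
elsewhere = [(0, 255), (443, 443)]  # same addresses, different port
print(fit_percentage(model, half), classify(fit_percentage(model, half)))
```

With the intersection-over-union choice, 100% means the two figures coincide and a high but incomplete fit means "similar, but different in at least one property", which is the situation the text flags as suspicious.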
Last but not least improvement: communication from NURI to humans and/or external programs has been largely extended.
Sometimes a "global", "overall" view of the network is useful, while at other times an in-depth view of the details is needed; NURI tries to overcome the problem by presenting data in a human-suitable form, through a flexible way to "subset" all kinds of data and to present just the portion of data considered interesting in that precise context, right now!
In other words NURI keeps a very deep level of detail and extracts sense from it, up to the exact form the net manager likes.
The same flexibility in dumping XML to CSV format is the key to reaching the second goal; any other kind of computation can be done on the data stored in devices, so XML and CSV are available as input for anything the net manager likes outside the NURI environment.
Common tasks
Since the current running configuration can be requested on the fly through "invisible" expect scripts, the ability to parse it, populate some arrays and dump them as XML has been included in the PHP programs.
There is a converter for each kind of manageable device, Juniper and Cisco (CheckPoint needs an intermediate conversion into Cisco style to be manageable); they are called as intermediate processors and the expected result is a file in the ~tmp subdir that in turn will be processed by one or more of the following programs.
On the opposite side there is the need to convert virtual L3 device XMLs into command sequences in proprietary style; that task is delegated to a flexible dumper, a program that translates XML into:
an IOS plaintext.
a JunOS plaintext.
a CSV plaintext.
an HTML.
For all of them there is a choice of selections on both field presence (and sorting) and ACL "conformation".
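The CSV flavour of the dumper, with field selection and sorting, might look like the following Python sketch. The XML layout (a device element containing acl elements with action/src/dst children) and the function name xml_to_csv are assumptions for illustration; NURI's real schema is not shown in this document.

```python
import csv
import io
import xml.etree.ElementTree as ET


def xml_to_csv(xml_text, fields, sort_by=None):
    """Dump the chosen fields of every <acl> element as CSV text."""
    root = ET.fromstring(xml_text)
    rows = [
        {field: acl.findtext(field, default="") for field in fields}
        for acl in root.iter("acl")
    ]
    if sort_by is not None:
        rows.sort(key=lambda row: row[sort_by])  # sort_by must be one of fields
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=fields, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return out.getvalue()


# Hypothetical virtual-device XML
sample_xml = """<device>
  <acl><action>permit</action><src>10.2.0.0/16</src><dst>10.9.9.9/32</dst></acl>
  <acl><action>deny</action><src>10.1.0.0/16</src><dst>10.9.9.9/32</dst></acl>
</device>"""
print(xml_to_csv(sample_xml, ["src", "action"], sort_by="src"))
```

Here only two of the three fields are selected and the rows are sorted by source, which mirrors the "field presence (and sorting)" choice described above.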
How to get the best results
Well, there is no predefined method to reach your goal, because it is different from anyone else's; but, generally speaking, you presumably need to keep the runconf of some complex devices simple and ordered.
The most productive way is the reiteration of some simple task; the fastest way, trying to do all optimizations in a single step, is dangerous, to say the least.
Don't delegate too much to a simple tool, it can fail in your own context; presumably you want to keep control of every decision.
As a "first step" consider order; why is order needed in the ACL sequence? Because of overlap.
We need as little overlap as possible and keeping it that way by hand is a very time-wasting task. The rules sorter does it for you; she reads a virtual L3 device and sorts it, giving back just what she had in input, reordered.
Ordered. How? Well, you control it, but the default is a good mix in most cases; the config file holds the weight of every field, as well as the asc(ending) or desc(ending) order for each of them.
Play with them as soon as you realize your own environment can be better served in a different way; don't fear, it is simple & safe.
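One plausible reading of "a weight and a direction per field" is a multi-key sort in which the heaviest field is the primary key. The Python sketch below shows that reading only; the field names, the weights and the SORT_CONFIG structure are hypothetical and do not reproduce NURI's config file.

```python
# Hypothetical configuration: field -> (weight, direction).
# In this sketch a higher weight means a more significant sort key.
SORT_CONFIG = {
    "interface": (40, "asc"),
    "dst_port": (30, "asc"),
    "src": (20, "desc"),
}


def sort_rules(rules, config):
    """Return the same rules, reordered according to the weighted fields."""
    ordered = list(rules)
    # Python's sort is stable, so sorting by the least significant field first
    # and the most significant field last yields a multi-key ordering.
    for field, (_weight, direction) in sorted(config.items(), key=lambda item: item[1][0]):
        ordered.sort(key=lambda rule: rule[field], reverse=(direction == "desc"))
    return ordered


rules = [
    {"interface": "eth1", "dst_port": 80, "src": "10.0.0.0/8"},
    {"interface": "eth0", "dst_port": 443, "src": "10.1.0.0/16"},
    {"interface": "eth0", "dst_port": 22, "src": "10.2.0.0/16"},
]
for rule in sort_rules(rules, SORT_CONFIG):
    print(rule)
```

The output contains exactly the input rules, only reordered, which is the property the text asks of the sorter.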
Another "productive step" is the minimization of rules, by supernetting whatever is possible: src & dst addresses as well as src & dst port ranges. The rules optimizer accepts a virtual L3 device as input and drops all duplicated ACLs; then she does all possible kinds of supernetting, chooses a nice application sequence and gives back an XML with human-directed suggestions plus a virtual configuration with all of the supernetting already done.
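A small Python sketch of the two steps (drop duplicates, then supernet) follows. It is a simplification under stated assumptions: rules are (action, source, destination) tuples, only source addresses are supernetted, and the order of rules is ignored, which a real optimizer cannot do. The function name optimize is invented; ipaddress.collapse_addresses is the standard-library call doing the supernetting.

```python
import ipaddress
from collections import defaultdict


def optimize(rules):
    """Drop duplicated rules, then supernet sources that share action and destination."""
    unique = set(rules)  # duplicates disappear here
    sources = defaultdict(list)
    for action, src, dst in unique:
        sources[(action, dst)].append(ipaddress.ip_network(src))
    result = []
    for (action, dst), nets in sources.items():
        # collapse_addresses merges adjacent or nested networks into supernets
        for net in ipaddress.collapse_addresses(nets):
            result.append((action, str(net), dst))
    return sorted(result)


rules = [
    ("permit", "10.0.0.0/25", "10.9.9.9/32"),
    ("permit", "10.0.0.128/25", "10.9.9.9/32"),
    ("permit", "10.0.0.128/25", "10.9.9.9/32"),  # duplicate
    ("deny", "10.0.1.0/24", "10.9.9.9/32"),
]
print(optimize(rules))
```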
The third way to shrink the number of rules is the creation of groups, but the big question is: which groups? Because, you know, the best groups can evolve in time, and humans are unable to follow details such as "some ACLs have been added/deleted/modified; now which groups can be better redefined?" The rules aggregator does the job by trying each and every productive grouping, giving back the really minimal set of rules, possibly using a group on each property (addresses & port ranges on both source & destination).
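To give an idea of what a single grouping pass could look like, the Python sketch below merges rules that differ in exactly one property into one rule carrying a group for that property. It is only one step of what the text describes (the aggregator is said to try every productive grouping); the rule fields and the function name group_by_property are assumptions.

```python
from collections import defaultdict


def group_by_property(rules, prop):
    """Merge rules that are identical except for `prop` into one rule with a group."""
    buckets = defaultdict(set)
    for rule in rules:
        # Everything except the grouped property identifies the bucket.
        rest = tuple(sorted((key, value) for key, value in rule.items() if key != prop))
        buckets[rest].add(rule[prop])
    grouped = []
    for rest, members in buckets.items():
        merged = dict(rest)
        merged[prop] = sorted(members)  # the group replacing the single value
        grouped.append(merged)
    return grouped


rules = [
    {"action": "permit", "src": "10.0.0.1", "dst": "10.9.9.9", "dst_port": 80},
    {"action": "permit", "src": "10.0.0.2", "dst": "10.9.9.9", "dst_port": 80},
    {"action": "permit", "src": "10.0.0.3", "dst": "10.9.9.9", "dst_port": 22},
]
for rule in group_by_property(rules, "src"):
    print(rule)
```

Running the same function for each property in turn and keeping the result with the fewest rules would be a naive way to approach the "which groups?" question.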
Routers too can suffer from "dirty knowledge"; in a huge and complex network routers learn routes from a wide range of sources, some of which can be suboptimal, and the routing protocols will "spam" that "noise" everywhere.
Just consider a pool of core routers with connected networks, some statics, some knowledge derived from OSPF, some more from EIGRP and BGP.
A simple mistake, let's say 2 supernettable networks in a router, can be amplified in many other devices and will spread to almost any routing-protocol-capable device because of Murphy's law. The route optimizer does that kind of check; if something is supernettable, the human is advised to take appropriate action on the appropriate L3 device.
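The check can be pictured as "group the routes by gateway, then see whether any group collapses". The Python sketch below does that and returns suggestions instead of changing anything; the (network, gateway) route format and the function name suggest_supernets are assumptions for the example.

```python
import ipaddress
from collections import defaultdict


def suggest_supernets(routes):
    """Suggest supernets for routes that share the same gateway."""
    by_gateway = defaultdict(list)
    for cidr, gateway in routes:
        by_gateway[gateway].append(ipaddress.ip_network(cidr))
    suggestions = []
    for gateway, nets in by_gateway.items():
        for supernet in ipaddress.collapse_addresses(nets):
            covered = [str(net) for net in sorted(nets) if net.subnet_of(supernet)]
            if len(covered) > 1:  # at least two routes can be replaced by one
                suggestions.append((gateway, covered, str(supernet)))
    return suggestions


routes = [
    ("192.168.0.0/24", "gw1"),
    ("192.168.1.0/24", "gw1"),
    ("172.16.0.0/16", "gw2"),
]
print(suggest_supernets(routes))
```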
It is unnecessary to complete one optimization before moving to another; NURI is tuned to play with chaotic runconfs... but if all optimizations are in place you will enjoy optimal results, which are the goal!
Error hunting
How many errors are still in place? You have seen (and corrected) most of them while minimizing the ACLs, but nobody knows how many incongruences exist in a given runconf. NURI was able to pinpoint 5 mismatches in a 40263-ACL pool of a firewall managed by a CCIE; if you are not a CCIE, try her ability in error hunting, you see: humans have a much higher error rate...
Let's play with the config checker, a program that does very simple tests on the congruence of ACLs and routing, including static NAT (just static, no other types are handled for now!)
The output is the usual one: an XML directed to humans, with just what seems wrong, and another XML with all ACLs except the wrong ones.
WARNING! You are encouraged to double check anything.
There are plenty of cases of "fake" results; let's just give an example: a firewall that knows hundreds of subnets of 10.0.0.0/8 on each of its interfaces, with consistent routing and a generic 10.0.0.0/8 on intf X, and an SNMP-trap collector that wants to be reachable by any SNMP-capable device in 10.0.0.0/8.
Hundreds of specific rules can do the job in a detailed way, but a generalization such as "permit SNMP from 10.0.0.0/8 to SNMP-host" applied on every interface can also do it, saving huge pools of ACLs. NURI will list those "generic" rules as errors because of incongruences in routing (remember that, as far as routing is concerned, 10.0.0.0/8 is linked to just one interface).
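The "fake" result above is easy to reproduce with a toy congruence test. The Python sketch below assumes one simple rule: an ACL applied on an interface is congruent when the most specific route containing its source network points to that same interface. The route table, the interface names and the function name check_acl are hypothetical; NURI's real tests are not documented here.

```python
import ipaddress


def check_acl(acl_interface, acl_source, routes):
    """Compare the interface an ACL is applied on with the routing of its source."""
    source = ipaddress.ip_network(acl_source)
    containing = [(net, intf) for net, intf in routes if source.subnet_of(net)]
    if not containing:
        return "unknown source"
    # Longest prefix wins, as in ordinary routing.
    _net, interface = max(containing, key=lambda item: item[0].prefixlen)
    return "ok" if interface == acl_interface else "incongruent"


# Hypothetical routing: specific subnets on eth1/eth2, the generic /8 on ethX.
routes = [
    (ipaddress.ip_network("10.0.0.0/8"), "ethX"),
    (ipaddress.ip_network("10.1.0.0/16"), "eth1"),
    (ipaddress.ip_network("10.2.0.0/16"), "eth2"),
]
print(check_acl("eth1", "10.1.0.0/16", routes))  # specific rule
print(check_acl("eth1", "10.0.0.0/8", routes))   # generic rule on eth1
```

The generic "from 10.0.0.0/8" rule on eth1 is reported as incongruent even though it was written on purpose, which is why every reported error deserves a second look.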
Adaptive flow discovery
There are "special events" in a firewall's life, such as when it is born. Usually a "new" firewall is placed between networks already in place, with their own data flows; obviously it will cut more or fewer of the "needed" flows until instructed.
Instructing an idiot is something that humans dislike; they do not have sufficient time for that kind of "clearness" in rule construction, and it seems much better to allow "IP from net-A to net-B", but that contradicts the very existence of the device.
There is no way around it, a human must be involved, but can be involved just in the decisions while the details are delegated to programs, such as the flow analyzer, which uses both virtual firewalls and virtual syslogs as inputs and decides for each connection which ACL allows the flow, counting bytes & duration (grading usefulness).
Because each ACL has been split into its atomic subcomponents (an ACL for each src-addr/src-port/dst-addr/dst-port/protocol/interface) it becomes possible to obtain the detailed composition of the flows "walking the firewall" through a given pool of ACLs and syslogs.
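The splitting into atomic ACLs and the accounting of logged flows can be sketched in Python as follows. Everything here is an assumption made for illustration: each ACL property is a list of plain values, "any" is the only wildcard, matching is by equality rather than by network containment, and the record layouts and function names are invented.

```python
from collections import defaultdict
from itertools import product

KEYS = ("interface", "protocol", "src_addr", "src_port", "dst_addr", "dst_port")


def atomize(acl):
    """Split one ACL (each property a list of values) into atomic ACLs."""
    return [
        dict(zip(KEYS, combo), name=acl["name"])
        for combo in product(*(acl[key] for key in KEYS))
    ]


def atom_id(atom):
    return (atom["name"],) + tuple(atom[key] for key in KEYS)


def account_flows(acls, flows):
    """Sum flows, bytes and duration per atomic ACL; also list the atoms never hit."""
    atoms = [atom for acl in acls for atom in atomize(acl)]
    usage = defaultdict(lambda: {"flows": 0, "bytes": 0, "seconds": 0})
    for flow in flows:
        for atom in atoms:
            if all(atom[key] in ("any", flow[key]) for key in KEYS):
                entry = usage[atom_id(atom)]
                entry["flows"] += 1
                entry["bytes"] += flow["bytes"]
                entry["seconds"] += flow["seconds"]
                break  # first matching atom wins, as in a top-down ACL
    unused = [atom for atom in atoms if atom_id(atom) not in usage]
    return dict(usage), unused


acls = [{
    "name": "web", "interface": ["outside"], "protocol": ["tcp"],
    "src_addr": ["10.0.0.1", "10.0.0.2"], "src_port": ["any"],
    "dst_addr": ["10.9.9.9"], "dst_port": [80, 443],
}]
flows = [
    {"interface": "outside", "protocol": "tcp", "src_addr": "10.0.0.1", "src_port": 51515,
     "dst_addr": "10.9.9.9", "dst_port": 80, "bytes": 1200, "seconds": 3},
    {"interface": "outside", "protocol": "tcp", "src_addr": "10.0.0.1", "src_port": 51999,
     "dst_addr": "10.9.9.9", "dst_port": 80, "bytes": 800, "seconds": 2},
]
usage, unused = account_flows(acls, flows)
print(usage)
print(len(unused), "atomic ACLs never matched")
```

The list of never-matched atoms is exactly the kind of output that must be read with care: an atom with no hits in the logs is a candidate for review, not automatically a rule to delete.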
Warning! It is recommended to use a huge database of syslogs to give every rule a chance to be matched! Don't simply throw out everything except what NURI shows as minimal! Maybe a "backup/failover" device never uses a rule; that is not sufficient reason to drop the corresponding ACL!
Reiterating the discovery of discrepancies between theory and reality in flows is very powerful, but don't assume it can be perfect in the presence of an imperfect description of reality such as the one derived from active flows.
Network management
Almost all network tools include some way to graph topology knowledge; NURI doesn't try to emulate any of them, but to convert the data hidden in complex devices into a human-friendly form.
First of all: a netmap is meaningful if and only if the data in the graph are directly driven by current knowledge; one that shows "old" and/or "planned" links is a problem at debug time; so the maps are derived by "asking the router about actual routing" and nothing else. The topology analyzer reads a virtual router file and converts it into a graph through the dot grapher(s), optionally compressing the routing information.
When a router knows thousands of routes through tens of gateways it becomes necessary to adopt some kind of compression of the knowledge, so we force all possible supernetting as a first step; then we package into boxes the different networks with the same link path.
That is almost sufficient in most cases, but not for huge configs, and it gets much harder if multiple interconnections are in place; we need a more powerful shrinking of information, such as the following:
Remote networks (more than N hops away) can be moved to nearer gateways, limiting the number of routers represented in a graph.
The operator is allowed to state something like "in the presence of K% of a network, round K up to 100"; e.g. when the routes are 10.20.30.64/26 and 10.20.30.128/25 and K is less than 75, those 2 routes are rounded to 10.20.30.0/24.
The operator may give an upper limit on graph objects, in which case K is calculated by NURI.
Choices that lead to a meaningless K (less than 50) are not allowed.
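The K% rounding can be checked with a little arithmetic: 10.20.30.64/26 holds 64 addresses and 10.20.30.128/25 holds 128, so together they cover 192 of the 256 addresses of 10.20.30.0/24, i.e. 75%. The Python sketch below implements that test; the strict "coverage greater than K" comparison is taken from the worked example in the text, while the function name and its interface are assumptions.

```python
import ipaddress


def round_routes(routes, supernet, k):
    """Replace the routes inside `supernet` with the supernet when coverage exceeds K%."""
    if k < 50:
        raise ValueError("a K below 50 is meaningless and not allowed")
    target = ipaddress.ip_network(supernet)
    networks = [ipaddress.ip_network(route) for route in routes]
    # Collapse first, so overlapping routes are not counted twice.
    inside = list(ipaddress.collapse_addresses([n for n in networks if n.subnet_of(target)]))
    coverage = 100.0 * sum(n.num_addresses for n in inside) / target.num_addresses
    if coverage > k:
        outside = [str(n) for n in networks if not n.subnet_of(target)]
        return outside + [supernet], coverage
    return list(routes), coverage


routes = ["10.20.30.64/26", "10.20.30.128/25"]
print(round_routes(routes, "10.20.30.0/24", 70))
print(round_routes(routes, "10.20.30.0/24", 80))
```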
In the context of huge and complex networks one of the most time-wasting tasks is determining the "exact" location of a given host; usually the IP addr is sufficient to "guess" the correct path and the "traceroute" command is a great help with that.
But neither human knowledge nor ICMP can always lead to a usable result; the host finder does the job for you: it analyzes each virtual device looking for the actual path, if any.
Moreover L2 devices can be analyzed, so the end point is determined as an L3 & L2 path, down to the switch/module/port of the searched host; then a graph can be drawn from the given starting point (always an L3 device) to the target.
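The L3 part of such a search amounts to repeating a longest-prefix lookup device after device. The Python sketch below shows this under stated assumptions: each virtual device is a list of (network, next hop) pairs, the next hop is either another device name or the marker "connected", and the function name find_path is invented; the L2 part (switch/module/port) is not covered.

```python
import ipaddress


def find_path(devices, start, host, max_hops=32):
    """Follow longest-prefix matches from `start` until the host's network is connected."""
    address = ipaddress.ip_address(host)
    path, current = [start], start
    for _ in range(max_hops):
        matches = [(net, hop) for net, hop in devices[current] if address in net]
        if not matches:
            return None  # this device has no route to the host
        _net, next_hop = max(matches, key=lambda item: item[0].prefixlen)
        if next_hop == "connected":
            return path
        if next_hop in path or next_hop not in devices:
            return None  # routing loop, or a gateway we hold no runconf for
        path.append(next_hop)
        current = next_hop
    return None


# Hypothetical virtual devices
devices = {
    "core": [
        (ipaddress.ip_network("10.0.0.0/8"), "dist1"),
        (ipaddress.ip_network("0.0.0.0/0"), "edge"),
    ],
    "dist1": [(ipaddress.ip_network("10.1.2.0/24"), "connected")],
    "edge": [],
}
print(find_path(devices, "core", "10.1.2.3"))
print(find_path(devices, "core", "192.0.2.1"))
```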
Security considerations
Obviously, the system (usually) keeps copies of runconfs, so passwords (encrypted or not) and SNMP communities are available to any user; such a system is fine for single-user environments as well as for a staff of people sharing the same granted security level.
Much more explicit info is in the "expect scripts": to be able to query devices they need to know the passwords; that behavior can be changed with some simple adjustments to the scripts, asking the operator for passwords as needed.
Almost any file you supply to the system is, by definition, a source of interesting info and therefore must be kept as safe as you like.
Some of the functions can require a huge amount of CPU and RAM; this means that in a non-dedicated environment you can cause "failures" in other services just by asking the system for something.
WARNINGS
Because of the security considerations it is not a good idea to expose the server to the Internet.
Absolutely no effort has been made to treat "labels" as meaningful, so everything is expressed in terms of "natural" objects: IP addresses, protocol numbers & port ranges.
Copying & pasting 100K ACL rows can lead to really embarrassing situations... for huge configs consider the use of "expect scripts" to push ACLs into devices.
As a work in progress it has bugs that will be fixed as time permits; for more support mail me at a-dot-spinella-at-rfc1925-dot-net.
You are presumed to be using devices with huge traffic flows and complex configurations: DON'T TURN OFF THE BRAIN WHILE USING NURI !!!
This work has been done in full respect of RFC 1925 and is dedicated to Nuria Molto', with love.