Archive for June, 1997

Forthcoming Contributions

by

DOI: 10.1023/A:1018779013170
Print publication date: 6/1/1997
View article on SpringerLink

Trends in Layered Network Management of ATM, SONET, and WDM Technologies For Network Survivability and Fault Management

by Doverspike, Robert

DOI: 10.1023/A:1018727029099
Print publication date: 6/1/1997
View article on SpringerLink

Empirical Evidence of Reliability Growth In Large-Scale Networks

by Snow, Andrew P.; Weiss, Martin B. H.

An analysis of major telecommunications outages experienced by a nation-wide network is presented. The purpose of this analysis is to examine the utility of Nonhomogeneous Poisson Process (NHPP) models in characterizing large-scale network failure behavior. The analysis not only shows the suitability of this theory, but also demonstrates the reliability growth of these network services for the time period studied. Modeling network failures as an NHPP also allows the decomposition of the failure intensity into individual hazards, providing insights into failure causes. In addition, network reliability can be characterized in classical probabilistic terms. The usefulness and limitations of this technique are discussed.
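
As a rough illustration of how reliability growth can be read off an NHPP fit, the sketch below fits a power-law intensity lambda * beta * t^(beta - 1) to outage times by maximum likelihood. The power-law form, the function names, and the sample outage times are assumptions made for illustration; the abstract does not say which NHPP model the authors used. A fitted beta below 1 means the failure intensity falls over time, which is one common way to express reliability growth.

```python
import math

def fit_power_law_nhpp(event_times, observation_end):
    """Maximum-likelihood fit of a power-law NHPP to time-truncated data.

    event_times: failure times in (0, observation_end], hypothetical data.
    Returns (lam, beta) for the intensity lam * beta * t ** (beta - 1).
    """
    n = len(event_times)
    if n == 0:
        raise ValueError("need at least one event")
    if any(t <= 0 or t > observation_end for t in event_times):
        raise ValueError("event times must lie in (0, observation_end]")
    log_sum = sum(math.log(observation_end / t) for t in event_times)
    if log_sum == 0:
        raise ValueError("all events at observation_end; beta is undefined")
    beta = n / log_sum                   # shape: < 1 means decreasing intensity
    lam = n / observation_end ** beta    # scale: expected count at the end equals n
    return lam, beta

def intensity(t, lam, beta):
    """Failure intensity (events per unit time) at time t."""
    return lam * beta * t ** (beta - 1)

# Hypothetical outage times (days since start of observation), not real data.
outages = [1, 3, 7, 15, 31, 60]
lam, beta = fit_power_law_nhpp(outages, observation_end=100)
```

With these made-up times the gaps between outages widen, so the fitted beta comes out below 1 and `intensity(10, lam, beta)` is larger than `intensity(50, lam, beta)`.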

DOI: 10.1023/A:1018774912261
Print publication date: 6/1/1997
View article on SpringerLink

The Effect of Detection and Restoration Times for Error Recovery in Communication Networks

by Logothetis, Dimitris; Trivedi, Kishor

Detection and restoration times are often ignored when modeling network reliability. In this paper, we develop Markov Regenerative Reward Models (MRRM) to capture the effects of detection and restoration phases of network recovery. States of the MRRM represent conditions of network resources, while state transitions represent occurrences of failure, repair, detection, and restoration. Reward rates, assigned to states of the MRRM, are computed based on a performance model that accounts for contention. We compare our model with ones that ignore these parameters and show significant differences, in particular for transient measures.

DOI: 10.1023/A:1018722928191
Print publication date: 6/1/1997
View article on SpringerLink

Optimal Spare Capacity Preconfiguration for Faster Restoration of Mesh Networks

by Macgregor, M. H.; Grover, W. D.; Ryhorchuk, K.

Several distributed real-time methods have been proposed for restoration from single span failures in digital transport networks. These methods have the potential to avoid user service outages due to such failures, if they operate quickly enough. For example, switched 64 kbps connections will not be disconnected if the network can be restored before the time at which calls in progress are dropped, typically 1-2 seconds after a failure. However, it will be difficult to achieve the goal of sub-second restoration if cross-connects cannot operate crosspoints quickly enough, either due to large workloads during a restoration response, or because of implementation choices such as testing each cross-connection while in the midst of a serious outage. The results in this paper demonstrate that it can be useful to pre-operate selected cross-points between the spare links of a mesh-restorable network before any failure has occurred, putting the network into a statistically optimal state of readiness. When a failure occurs, some of the preconfigured restoration path bundles can be used immediately. If more restoration paths are needed, they can be obtained by a real-time restoration process. The first advantage of preconfiguration is that the number of cross-connection operations may be greatly reduced or eliminated for a portion of the affected traffic. This will reduce restoration time significantly. Secondly, after utilizing preconfigured restoration paths, the workload of a real-time restoration process will be lower because it will be searching for fewer paths. This paper demonstrates that preconfiguration can supply a significant proportion of the replacement capacity required after a span failure. The results are obtained through integer programming.

DOI: 10.1023/A:1018770811352
Print publication date: 6/1/1997
View article on SpringerLink

Sequential and Parallel Approaches to Incorporate Reliability in the Synthesis of Computer Networks

by Chamberland, Steven; Sanso, Brunilde

This paper presents a scenario-oriented optimization model and solution algorithm to solve the joint routing/capacity assignment problem for computer networks. The advantage of this model is that failures are directly dealt with at the design stage. The model helps to find a suitable trade-off between capacity assignment and performance in the event of failures. As accounting for major failures can be very time consuming, we introduce parallelism as a tool to solve this type of problem. Two parallel versions of the algorithm were implemented. Both parallel versions were found to be extremely efficient in reducing computational time; the one presenting two levels of parallelization was found more suitable for larger networks.

DOI: 10.1023/A:1018718827282
Print publication date: 6/1/1997
View article on SpringerLink

A Generic Model for Fault Isolation in Integrated Management Systems

by Katker, Stefan; Geihs, Kurt

Distributed systems in enterprises as well as telecommunication environments strongly demand more automated fault management. A single fault in these complex systems might cause a huge number of symptomatic error messages and side effects to occur. The common root faults for these symptoms have to be identified to start fault removal procedures as soon as possible and to decrease system down-time. This paper presents a methodology for fault isolation in integrated management systems. A generic model is described that unifies the view of the management system on the managed environment. It integrates the relevant aspects of network, system, and service management layers in order to perform integrated fault isolation. Our approach is based on a general dependency graph model. It captures the information that is required to determine the root cause of a fault on the one hand, and the set of fault affected services and customers on the other hand. The layered TMN architecture serves as an example for an integrated management environment throughout this paper.
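
The two questions a dependency graph is said to answer here (what is the likely root cause, and what is affected) can be sketched as plain graph traversals. This is a minimal illustration, not the authors' model: the object names, the `DEPENDS_ON` table, and the rule "a candidate root cause is anything every symptomatic object depends on" are all assumptions made for the example.

```python
from collections import deque

# Hypothetical dependency graph: each key depends on the objects listed.
DEPENDS_ON = {
    "mail_service": ["mail_server", "dns_service"],
    "web_service": ["web_server", "dns_service"],
    "dns_service": ["dns_server"],
    "mail_server": ["router_a"],
    "web_server": ["router_a"],
    "dns_server": ["router_b"],
    "router_a": [],
    "router_b": [],
}

def reachable(graph, start):
    """Return every node reachable from start, including start itself."""
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in graph.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen

def root_cause_candidates(graph, symptoms):
    """Objects that every symptomatic object depends on, directly or not."""
    sets = [reachable(graph, s) for s in symptoms]
    return set.intersection(*sets) if sets else set()

def affected_by(graph, fault):
    """Objects that depend, directly or transitively, on the faulty object."""
    reverse = {node: [] for node in graph}
    for node, deps in graph.items():
        for dep in deps:
            reverse.setdefault(dep, []).append(node)
    return reachable(reverse, fault) - {fault}

candidates = root_cause_candidates(DEPENDS_ON, ["mail_service", "web_service"])
impacted = affected_by(DEPENDS_ON, "router_b")
```

In this toy graph, error reports from both services narrow the candidates to the objects they share (`dns_service`, `dns_server`, `router_a`, `router_b`), and a fault in `router_b` is traced upward to the DNS objects and both services. A real system would need further evidence, such as status tests, to pick among the candidates.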

DOI: 10.1023/A:1018766610444
Print publication date: 6/1/1997
View article on SpringerLink

The Case of the Creeping Error, or 1:3:3:1

by Bernstein, Lawrence; Yuhas, C. M.

DOI: 10.1023/A:1018714626373
Print publication date: 6/1/1997
View article on SpringerLink

Towards Fault Recovery and Management in Communication Networks

by Medhi, D.; Tipper, D.

DOI: 10.1023/A:1018762509535
Print publication date: 6/1/1997
View article on SpringerLink
