Ring Protection and Survivability Mechanisms for Packet Transport Networks

2010-06-05 Lu Yueming

ZTE Communications, 2010, Issue 3

Lu Yueming

(Key Laboratory of Information Photonics and Optical Communications of Ministry of Education, Beijing University of Posts and Telecommunications, Beijing 100876, P. R. China)

Abstract: Packet Transport Networks (PTNs) must resolve issues in their protection and recovery mechanisms. These issues include the detection performance of Operation, Administration and Maintenance (OAM), network resource optimization, resource allocation deadlock, and resource deployment blocking. Traditional protection and recovery mechanisms cannot meet the requirements of PTN. To improve protection and recovery performance, network resource utilization, and the probability of service recovery, and to decrease the probability of service blocking, this paper introduces overlapped segment shared protection, Pre-configured Multi-Cycle (P-mcycle), a conflict-free algorithm, and a delay restoration algorithm. Only with appropriate protection and restoration mechanisms is it possible to achieve smooth evolution from TDM networks to integrated packet-based bearer networks and all-IP services.

This work was funded by the National High Technology Research and Development Program of China (“863” program) under Grant No. 2007AA01Z252.

The development of metro transport networking has been driven by the access and transmission demands of broadband data services (e.g. triple play), Ethernet Private Line (EPL) enterprise services, Layer 2 Virtual Private Network (L2VPN) services, and common broadband services. As service and bearer networks move towards IP, basic transport networks are evolving into Packet Transport Networks (PTNs) [1].

With good scalability, powerful OAM, and fast protection switching, PTN inherits some of the characteristics of traditional Synchronous Digital Hierarchy (SDH) transport networking. New features such as packet switching, statistical multiplexing, connection-oriented label switching, Quality of Service (QoS) guarantee, and flexible dynamic control are added for adaptation to data services. These are basic but important technologies for network convergence. In terms of transport mode, PTN not only provides packet data services and traditional Time Division Multiplexing (TDM) services, but also offers Asynchronous Transfer Mode (ATM), Inverse Multiplexing for ATM (IMA) [2], and Multilink Point-to-Point Protocol (MLPPP) to meet the transmission demands of 3G mobile systems. PTN devices adopt several encapsulation and adaptation technologies, integrating various TDM and data services on a unified packet transport plane. PTN is thus a unified, packet-switched multi-service transmission platform [3].

However, integrating various demands and diversified application scenarios creates some critical problems for survivability. This issue directly impacts both the networking mode (ring, mesh, or star) and the service setup mode (1+1 or 1:1 path protection) of a PTN, and thus affects services. The survivability of PTN involves blocking during Label Switched Path (LSP) setup, resource utilization, and coordination of resource allocation [4]. Unlike those of SDH networks, PTN protection mechanisms are still being studied, and protection performance is far from adequate.

The protection and restoration strategies of PTN play an important role in the network’s service performance, resource utilization, and survivability. Because PTN imposes high requirements on survivability, research on its protection and restoration strategies is important. Only with adequate protection and restoration mechanisms is it possible to evolve smoothly from TDM-based transport to packet transport, to achieve network convergence, and to deliver all-IP services.

1 PTN Research Status

There are currently two main technologies for implementing PTN: Transport Multiprotocol Label Switching (T-MPLS) [5]/Multiprotocol Label Switching Traffic Engineering (MPLS-TE) [6], and Provider Backbone Bridge (PBB)/Provider Backbone Transport (PBT) [7]. Research into survivability strategies for PTNs based on these two technologies has attracted much attention. Internationally, institutes and organizations such as Alcatel [8], Nortel [7], Cisco [9], Sycamore [10], the European Union Seventh Framework Programme (FP7) [11], the University of Texas [12], Stanford University [13], and Bell Labs [14] are carrying out research. In China, telecom equipment manufacturers such as ZTE, Huawei, and Fiberhome are studying survivability strategies, while research institutes at Tsinghua University, Peking University, Shanghai Jiaotong University, and Beijing University of Posts and Telecommunications are initiating research into protection and restoration mechanisms from the perspective of resource management, distributed parallel resource allocation, and contention.

The International Telecommunication Union-Telecommunication Standardization Sector (ITU-T) and the Internet Engineering Task Force (IETF) have brought forward their own solutions. ITU-T defines T-MPLS, and supports 1+1 and 1:1 linear protection (G.8131) as well as wrapping and steering ring protection (G.8132) [15]. IETF prefers the Fast ReRoute (FRR) [16] technique of Multiprotocol Label Switching (MPLS) to implement 1:N linear and ring protection. In 2008, IETF and ITU-T established a Joint Work Team (JWT), and experts within this team are currently studying the ring protection requirements of MPLS Transport Profile (MPLS-TP). Limited by its global labels, PBB Traffic Engineering (PBB-TE) supports 1:1 linear protection, but does not support subnetwork protection or connection-based ring protection.

In China, PTN protection and restoration technologies are still in the early stages of research and discussion, with network device manufacturers focused on carrier-class protection switching strategies. Key technologies being considered include fault detection and fault notification. These technologies aim to achieve an overall switching time of less than 50 ms, to support MPLS FRR protection (by enabling local protection of important links and network elements in the network topology), and to support Link Aggregation (LAG) protection of the links. In 2008, the China Communications Standards Association (CCSA) established a forum specifically focused on transport devices based on a unified switching plane. In 2010, the forum turned to T-MPLS protection mechanisms.

Overall, current research into PTN protection mechanisms is focused on improving T-MPLS 1+1 and 1:1 linear protection, which no longer meets the demands of various services. PTN requires multi-level, multi-layer protection and restoration. To achieve this, path-level and link-level protection are necessary, as is quick routing or rerouting of services by Generalized MPLS (GMPLS) intelligent software.

In fact, precursors to packet network protection mechanisms already exist. W. Grover, for example, introduced the concept of the P-cycle [17] (designed for IP networks). Before a fault occurs, a set of P-cycles is pre-configured and routed through all nodes and links to be protected, and resources are reserved for them. When a fault occurs, switching is performed according to the pre-designed ring. In this way, ring protection is achieved. In a published patent, Robert Sultan proposes a server trail failure message notification mechanism. This increases the probability of service recovery in cases of network congestion [18]. Raymond Xie of Sycamore suggests an intelligent protection and restoration mechanism for EtherOptics-based PTN [19-20].

2 Challenges for PTN Survivability

Survivability is an important issue that must be addressed before carrier-class PTNs can be deployed. Challenges for PTN include survivability evaluation and optimization, distributed deadlock and resource contention, PTN-oriented protection and restoration blocking, and fault detection of OAM.

(1)Survivability Evaluation and Optimization

Among these challenges, survivability evaluation and optimization is particularly troublesome. First, determining reasonable criteria for evaluating survivability is difficult. Second, there are few good optimization methods and technologies for PTN survivability. Many optimization problems are hard to solve because it is not known in advance what the best network looks like, and limited computing capability makes exhaustive search infeasible. In practice, only heuristics such as genetic and greedy algorithms can be used to find a better solution to survivability optimization in distributed networks.

(2)Distributed Deadlock and Resource Contention

To a greater or lesser extent, PTN may encounter distributed deadlock and resource contention problems when establishing a fast LSP, preempting a fast LSP in 1:N protection, or reselecting a quick LSP. At present, the ostrich algorithm (that is, ignoring the problem) is commonly used to deal with these problems. This approach meets the minimum requirements of IP networks, but cannot realize fast protection and restoration. Distributed deadlock is an awkward problem affecting computer operating systems and networks, and it has yet to be solved. For computer operating systems, Dijkstra proposed the banker’s algorithm, but its effectiveness has proven to be poor. The deadlock problem receives little attention in MPLS research; however, in GMPLS research, Zafar Ali et al. have begun to think deeply about the problem and have raised it with the IETF. Studies into deadlock are currently underway.

(3)Blocking in Deploying PTN-Oriented Protection and Restoration

Some protection mechanisms similar to those of SDH have already been adopted in PTN. However, in distributed routing environments rigid protection paths are set up, and the blocking rate with these mechanisms is very high. To solve this problem, lenient protection and restoration mechanisms must be worked out to complement the traditional rigid ones and to meet the specific requirements of PTN.

(4)Fault Detection of OAM

OAM detects Peer-to-Peer (P2P) connectivity status using connectivity detection messages that are exchanged periodically. Along with other methods, it measures packet loss rate, delay, and jitter, and supports loopback, alarm suppression, and alarm feedback. Moreover, OAM uses the Automatic Protection Switching (APS) protocol to protect the two End-to-End (E2E) paths. Whatever the situation, SDH-like protection mechanisms must provide carrier-class protection, which implies a switching time of no more than 50 ms. The OAM engine therefore cannot take long to complete path fault detection. Specifically, to ensure an overall switching time of less than 50 ms, fault detection and switch triggering must be completed within 25 ms of a fault occurring. OAM fault detection performance is therefore very important.
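The timing budget above can be checked with simple arithmetic. The following is a minimal sketch, assuming that a fault is declared after a fixed number of consecutive connectivity-check intervals pass without a message. The interval values and the 3.5-interval threshold are illustrative assumptions (of the kind used in Ethernet OAM), not figures given in this paper; only the 50 ms and 25 ms budgets come from the text.

```python
# Budgets stated in the text.
SWITCH_BUDGET_MS = 50.0      # overall carrier-class switching time
DETECTION_BUDGET_MS = 25.0   # fault detection plus switch triggering

def detection_time_ms(cc_interval_ms, missed_threshold):
    """Worst-case time to declare a fault (assumed model):
    the fault is declared after `missed_threshold` intervals
    elapse without a connectivity-check message."""
    return cc_interval_ms * missed_threshold

def fits_budget(cc_interval_ms, missed_threshold,
                budget_ms=DETECTION_BUDGET_MS):
    """True if detection completes within the detection budget."""
    return detection_time_ms(cc_interval_ms, missed_threshold) <= budget_ms

if __name__ == "__main__":
    # Hypothetical connectivity-check intervals, in milliseconds.
    for interval in (3.33, 10.0):
        t = detection_time_ms(interval, 3.5)
        print(interval, round(t, 2), fits_budget(interval, 3.5))
```

Under these assumptions, a 3.33 ms interval leaves ample margin within the 25 ms detection budget, whereas a 10 ms interval already exceeds it.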

3 Ring Protection and Survivability Mechanisms for PTN

Most PTN core networks adopt a mesh structure, so ring protection differs from that of SDH. This section introduces research carried out by BUPT on ring protection and survivability mechanisms for PTN.

(1)Overlapped Segment Shared Protection Mechanism

After intensive analysis of the constraint mechanism and the overlapped segment protection mechanism, the two mechanisms were integrated, and an overlapped segment shared protection algorithm [21] was used to dynamically adjust the working and protection paths based on link weights. This algorithm provides multiple overlapped protection segments along the entire working path, enabling reasonable and effective selection of these segments. Compared with traditional protection algorithms, the overlapped segment shared protection mechanism enhances network connection reliability, and improves the utilization of network resources by reasonably sharing resources among overlapped protection segments.

▲Figure 1. Non-overlapped and overlapped segment protection.

This mechanism protects the entire end-to-end working path using multiple backup protection segments. Compared with path protection, its limited restoration scope is a major advantage: when a link or node fails, only the corresponding protection segment is activated, so restoration time is shortened. The mechanism does not provide a protection segment for each link in the working path, so it achieves higher resource utilization than link protection. Instead, it provides a backup overlapped protection segment for each segment in the working path, allowing some working links to fall under the protection of several backup segments. This enables flexible network resource allocation and increases the survivability of the network. Traditional non-overlapped segment protection mechanisms cannot protect the nodes between segments. The overlapped segment protection mechanism resolves this problem: with neighboring segments overlapped, a faulty node between segments will not lead to failure of both the working and protection segments. Overlapped and non-overlapped segment protection are illustrated in Figure 1.
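The difference between non-overlapped and overlapped segments can be made concrete with a small coverage check. This is a minimal sketch under stated assumptions, not the algorithm of [21]: the working path is modeled as nodes indexed 0 to n-1, each protection segment is a hypothetical (start, end) index pair, and a segment is assumed to protect the links between its end nodes and the nodes strictly inside it (its own end nodes are where switching happens, so it cannot bypass them).

```python
def unprotected_nodes(path_len, segments):
    """Intermediate working-path nodes not bypassed by any segment.

    A segment (start, end) is assumed to protect only the nodes
    strictly between start and end.
    """
    protected = set()
    for start, end in segments:
        protected.update(range(start + 1, end))
    intermediate = set(range(1, path_len - 1))
    return sorted(intermediate - protected)

def unprotected_links(path_len, segments):
    """Working links not covered by any segment.

    Link i joins node i and node i + 1.
    """
    covered = set()
    for start, end in segments:
        covered.update(range(start, end))
    return sorted(set(range(path_len - 1)) - covered)

if __name__ == "__main__":
    # Five-node working path: 0 - 1 - 2 - 3 - 4 (hypothetical example).
    non_overlapped = [(0, 2), (2, 4)]
    overlapped = [(0, 3), (2, 4)]
    print(unprotected_nodes(5, non_overlapped))  # node shared by both segments
    print(unprotected_nodes(5, overlapped))
```

In the non-overlapped layout, node 2 terminates both segments and is bypassed by neither. In the overlapped layout, every intermediate node lies strictly inside at least one segment, and the link between nodes 2 and 3 is covered by two segments.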

(2)Pre-configured Multi-Cycle(P-mcycle)Scheme

To resolve uncertain restoration of a fault point in mesh networks, a new restoration scheme, P-mcycle, is introduced. This scheme uses an Integer Linear Programming (ILP) optimization algorithm (for P-mcycle generation) as well as a sub-cycle secondary routing algorithm. P-mcycle ensures a valid fault restoration time, greatly enhances the fault protection success rate, and enhances the efficiency of protection resource reservation. Thus, it improves the effectiveness and reliability of restoration in PTN.

The P-mcycle generation algorithm is based on ILP. It introduces a P-ring for fault-independent path protection. However, the algorithm involves complex computation, which often takes a long time. For a small network, P-mcycle generation takes 1 hour; for a large network, 2 days. In contrast, the ILP optimization algorithm leaves variables and constraints such as dynamic service distribution and service symmetry to the secondary routing algorithm. A pure ILP model is used only for finding optimal P-ring combinations. In this way, the computation involved in P-mcycle generation decreases considerably, and the generation task becomes static planning of the network topology.

With the number of variables and constraints decreased, and computational complexity dramatically reduced, an effective pre-selection method can be used to narrow down the number of possible solutions to the ILP problem. In a given network topology, an enhanced Depth-First Search (DFS) algorithm is first used to search for all simple rings. Then, the rings found are checked one by one to determine whether their links satisfy the constraints. The M rings with the largest AE values are selected to form a sub-optimal solution space. Finally, the ILP optimization algorithm is used to select a group of rings for the P-mcycle. A P-mcycle obtained in this way can protect all working wavelengths and ensure that the preset maximum reserved wavelength is not exceeded. Service is 100% restored with the least resources.
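The pre-selection step can be sketched as follows. This is an illustration under stated assumptions rather than the enhanced DFS or the ILP model of the paper: a plain DFS enumerates the simple rings of a small undirected graph, and the AE value is assumed to be the a priori efficiency commonly used in the P-cycle literature with unit link costs (each on-ring link counts 1, each straddling link counts 2, divided by the number of on-ring links). The constraint check and the final ILP selection are omitted, and all function names are hypothetical.

```python
def simple_cycles(adj):
    """Enumerate each simple ring of an undirected graph once."""
    cycles = []

    def dfs(start, current, path, visited):
        for nxt in adj[current]:
            if nxt == start and len(path) >= 3:
                # Keep a single orientation of each ring.
                if path[1] < path[-1]:
                    cycles.append(tuple(path))
            elif nxt > start and nxt not in visited:
                # Only visit nodes larger than start, so each ring is
                # rooted at its smallest node.
                visited.add(nxt)
                path.append(nxt)
                dfs(start, nxt, path, visited)
                path.pop()
                visited.remove(nxt)

    for s in sorted(adj):
        dfs(s, s, [s], {s})
    return cycles

def edges_of(adj):
    """All undirected links as frozensets of their two end nodes."""
    return {frozenset((u, v)) for u in adj for v in adj[u]}

def ae_value(cycle, all_edges):
    """Assumed AE: (on-ring links + 2 * straddling links) / on-ring links."""
    n = len(cycle)
    on_ring = {frozenset((cycle[i], cycle[(i + 1) % n])) for i in range(n)}
    members = set(cycle)
    straddling = {e for e in all_edges if e <= members and e not in on_ring}
    return (len(on_ring) + 2 * len(straddling)) / len(on_ring)

def preselect(adj, m):
    """Return the m rings with the largest assumed AE values."""
    all_edges = edges_of(adj)
    ranked = sorted(simple_cycles(adj), key=lambda c: -ae_value(c, all_edges))
    return ranked[:m]

if __name__ == "__main__":
    # Hypothetical topology: a square 0-1-2-3 with one diagonal 0-2.
    adj = {0: [1, 2, 3], 1: [0, 2], 2: [0, 1, 3], 3: [0, 2]}
    print(preselect(adj, 1))
```

In this toy topology the square ring ranks first because the diagonal is a straddling link, which the ring can protect without spending any capacity on it.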

▲Figure 2. Conflict-free algorithm.

The secondary routing algorithm simplifies the pre-configuration computation of the traditional P-cycle, and optimizes protection path selection during switching. With P-mcycle, fault restoration time satisfies the rigid survivability requirements of PTN. Compared with the traditional P-cycle, P-mcycle has a higher protection success rate, better resource utilization, and better efficiency. Simulation results show that the optimization is more pronounced in mesh networks where the average node density is low.

P-mcycle research will continue to focus on the following aspects: achieving P-mcycle optimization based on node fault restoration and multi-link/multi-node restoration; enabling reasonable distribution of P-mcycle generation resources by taking into account dynamic service distribution changes during candidate ring selection; and studying the traffic engineering involved in secondary routing based on the direction of services, in order to improve service protection success with fewer network resources.

(3)Conflict-Free Algorithm

When paths are being established, two nodes at opposite ends of a link may request the same resource (a port or wavelength λ) and send Reservation (Resv) messages. When each message reaches the other end of the link, the requested resource may already be occupied. In traditional solutions, the node receiving the Resv message treats this as a path error and returns a Reservation Error (ResvErr) message. If the resource is occupied at both ends, the nodes at both ends determine that their paths have not been established. As a result, the resources already used for path establishment are wasted.

The conflict-free algorithm avoids this problem, as shown in Figure 2. It works by applying a priority comparison strategy to Resv messages sent at the same time, so that one of the paths can be established [22]. Suppose a conflict takes place between Node C and Node D, and the resource allocated by Node C for Path 1 is the same as that allocated by Node D for Path 2. In traditional solutions, when the Resv 1 message sent by Node C arrives at Node D, the requested resource has already been occupied, and Node D treats it as a path error. Similarly, Node C registers a path error when it receives the Resv 2 message from Node D. Therefore, neither path can be established.

In the conflict-free algorithm, a counter i is carried along with the path message. The counter is set to 0 at the source node. It is incremented by 1 whenever the path message reaches a node, and the new value is stored at that node. When the path message arrives at the destination node, the value of the counter corresponds to the number of links needed to establish a path from the source node to the destination node. As shown in Figure 2, establishing a path from Node A to Node E requires passing 5 links; from Node B to Node E, 4 links.

With the conflict-free algorithm, long-distance paths are less likely to be established successfully. Suppose there are two path establishment requests: Request A involves 7 links, and Request B involves 3 links. If the conflict link is 2 links away from the source node of Request A, and one link away from the source node of Request B, Request B will succeed in establishing a path but Request A will fail. As a result, the 5 links that Request A had already reserved before the conflict must be released, resulting in wasted resources.
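The hop counter carried in the path message, and the cost of losing a conflict on a long path, can be sketched as follows. This is a minimal illustration with hypothetical function and node names. The priority comparison rule itself is not modeled, since the text does not spell it out, and the release count assumes that Resv messages travel from the destination towards the source, so that every link beyond the conflict point has already been reserved.

```python
def propagate_path_message(route):
    """Return the counter value stored at each node along the route.

    The counter starts at 0 at the source and is incremented by 1
    each time the path message reaches the next node.
    """
    stored = {}
    counter = 0
    for hop, node in enumerate(route):
        if hop > 0:
            counter += 1
        stored[node] = counter
    return stored

def links_to_release(total_links, conflict_distance_from_source):
    """Links already reserved by a request that loses the conflict.

    Assumes reservation proceeds from the destination towards the
    source, so the links beyond the conflict point are already held.
    """
    return total_links - conflict_distance_from_source

if __name__ == "__main__":
    # Hypothetical six-node route, i.e. a five-link path.
    route = ["N0", "N1", "N2", "N3", "N4", "N5"]
    print(propagate_path_message(route))
    # Request A from the example: 7 links, conflict 2 links from source.
    print(links_to_release(7, 2))
```

The counter stored at the destination equals the number of links on the path, and the second call reproduces the five reserved links that Request A has to give back.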

(4)Delay Recovery Algorithm

The delay recovery algorithm handles network faults by introducing sequenced delays during fault recovery.

After a fault occurs, the algorithm allows the source node of the service to send a delayed Notify message. As shown in Figure 3, a time interval t is inserted between two Notify messages. With this method, the probability of resource conflict is reduced in the re-routing and signaling process for fault recovery. When services in two opposite directions are recovered, signaling messages are sent in a certain sequence and no resource conflict occurs.
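The effect of the interval t can be illustrated with a small scheduling sketch. It assumes a simplified model that is not taken from the paper: each affected service occupies the signaling process for a fixed duration after its Notify message is sent, and two recoveries can conflict only if their signaling windows overlap. The service names, the duration, and the function names are hypothetical.

```python
def notify_send_times(services, t_ms):
    """Send the k-th service's Notify message k * t_ms after the fault."""
    return {svc: k * t_ms for k, svc in enumerate(services)}

def overlapping_pairs(send_times, signaling_ms):
    """Pairs of services whose assumed signaling windows overlap."""
    ordered = sorted(send_times.items(), key=lambda item: item[1])
    pairs = []
    for i in range(len(ordered)):
        for j in range(i + 1, len(ordered)):
            gap = ordered[j][1] - ordered[i][1]
            if gap < signaling_ms:
                pairs.append((ordered[i][0], ordered[j][0]))
    return pairs

if __name__ == "__main__":
    services = ["east", "west"]  # two services in opposite directions
    for t in (0.0, 30.0):
        times = notify_send_times(services, t)
        print(t, overlapping_pairs(times, signaling_ms=20.0))
```

With no delay, both recoveries signal at the same time and may contend for the same resource. Once t exceeds the assumed signaling duration, the two recoveries are serialized.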

4 Conclusion

PTN is a future direction in telecom network development. However, ring protection and survivability directly impact its QoS. PTN must resolve issues in fault detection, network deployment, and protection and restoration. Existing protection and restoration methods cannot meet the requirements of PTN, and new mechanisms must be worked out. These challenges are expected to be solved in the future.

◀Figure 3. Delay recovery algorithm.