Encyclopedia Knowledge

February 13, 2007

[HA] Introducing the Open Cluster Framework

Filed under: High Availability — encyclopedia @ 8:10 am

http://www.linuxjournal.com/article/6143

Talking with Alan Robertson (excitedly) about HA, HPC and their future in open-source clustering.

Those of you familiar with the Linux High Availability (HA) scene will recognize Alan Robertson’s name immediately. After all, along with people like Harald Milz and Lars Marowsky-Bree, he’s one of the main names in HA Linux, a frequent HA contributor and the owner of the linux-ha.org web site. Although his contributions in the area of HA might be well known to the community, Alan has a project in his background that is less well known but may prove to be instrumental in both the HA and High Performance Computing (HPC) cluster communities: the Open Cluster Framework (OCF) project.

The goal of the HA Project is to provide an HA clustering solution for Linux via community development, and the goal of OCF might be even more ambitious: to define APIs that provide basic clustering functions and to provide a reference implementation of the API. Note that these APIs do not extend only to HA clusters, but include HPC clusters as well. What is so ambitious is that at a time when many in the Open Source community are trying to develop solutions around a single project, the OCF is concentrating on trying to unite all the open-source HA projects and unify two major camps, HA and HPC, that are often thought of as separate entities.

The OCF Project itself is in its infancy. First presented at the Ottawa Linux Symposium in July 2001, the group is in the early stages of defining itself, aligning with various groups and supporters and coming up with a preliminary architecture. The philosophy of the OCF is that although HA and HPC have been largely separate over the years, they share many common clustering problems and would realize advantages by sharing code. As such, the intent of the group is to define and develop building blocks that can be used by the different cluster disciplines to build distinctly unique clusters suited to each cluster’s needs. Since OCF is defining an API for the building blocks as well as a reference implementation, the group expects there will be different implementations as well.

I could bore you with my personal spin on the OCF, but I recently had the opportunity to meet with Alan Robertson on his visit to beautiful, sunny Poughkeepsie, New York. If you’ve ever met Alan in person, he’s nothing if not entertaining and animated. At various times during our interview he flapped his arms, jogged in place and slammed his hand on the table, jarring the tape recorder. He’s clearly an evangelist for OCF, and he’s on a mission to join the various HA and HPC factions.

Richard: Alan, let’s introduce you to our readers. When did Linux first appear on your radar screen?

Alan: Well, let’s start with UNIX. I first used UNIX in 1978 when I worked for Bell Labs. I think I first started using Linux in 1993. There was a fellow in the office that was big on Slackware, but I liked the advantages of a packaged system. I think it was 3.03 Red Hat that I first ran, with the 1. something kernel.

Richard: And when did you start contributing to open source?

Alan: There’s a little bit of a story before that. Being a person that had done a lot of UNIX kernel hacking in the past, I was delighted to find that the code mistook my Mitsumi CD-ROM for a Sony CD-ROM, so I could find the driver and make it recognize my Mitsumi. So, that was my first and maybe my only Linux kernel hack; I’ve done mostly user-space Linux stuff. My first contribution was the high availability stuff. My job at the time was technology planner for R&D for Bell Labs. Ken Switzer, my second line manager at the time, asked what Linux had for high availability software. I didn’t know, so I went away to find out. I found out that they had a mailing list and a developer’s HOWTO by Harald Milz on how to write high availability software. I had read Eric Raymond’s The Cathedral and the Bazaar, and I had been fascinated by the idea that it captured something important. It wasn’t the hacker mentality or the communal developer approach; it was that by traditional models, Linux should be an abysmal failure. It breaks every single known rule of software development. It doesn’t have a plan, it doesn’t have an architecture, it doesn’t have an architect, it doesn’t have a waterfall model, it’s all done haphazardly [Alan's voice is dripping with sarcasm at this point] and yet, the evidence is, it’s been wildly successful. It captured my sense that a lot of software development methods were nonsense.

Richard: Did you relate this idea of the Cathedral and the Bazaar to HA?

Alan: This is a diversion, hang on. I knew a lot of people developing code; they were all good, and none of them followed the conventional software development models. I felt that Raymond got at something important, and the itch it created in me was to participate in an open-source project to experience it first hand.

I went to visit my in-laws over Christmas, when one tends to have some free time, so I brought my laptop with me. It was the perfect HA cluster. How, you ask, can one node be an HA cluster? Because the other node is dead! [lots of laughter]. So while I was at my in-laws for a couple of weeks I wrote the heartbeat module, came back and announced to the list that I had written some software. But it wasn’t as much to learn about HA as it was to learn about the open-source process. I didn’t do it to become a project leader, I did it to write some software and scratch my personal itch to learn how open-source development worked firsthand.

Richard: How successful has Heartbeat been since its release? Give me an idea where it’s been used.

Alan: It’s used by one medical imaging company that has to be on-line to give doctors the images they need. It’s used at Los Alamos National Lab, where it’s really important that your badge readers actually work, where security is especially high, or there’s a security/safety issue. My guess is there’s probably several thousand real, true production deployments of this software.

Richard: When did you start to think about the implications of other people in the HA space?

Alan: When other people came into the space. I wrote this code because there was none. I would not have written it had there already been code–I needed a good reason to write it myself.

To go on with the story, I got a call from Germany one day, from a Volker Weigand. Volker was a contributor to my project at the time, and he called to tell me SuSE was expanding their staff and was very interested in high availability. Eventually Volker said that he’d like me to come work for SuSE. I was concerned because I didn’t speak German, and I didn’t drink beer, and I didn’t want to move. Well, over time Volker got me excited about it and invited me for an interview in Germany. Shortly thereafter I joined SuSE. Just as I joined them, they announced they were going to partner with SGI, who wanted to open-source its HA package, Failsafe. So now, I was going to help introduce Failsafe to the Open Source community; they didn’t want to ruffle any community feathers, and the feathers they didn’t want to ruffle were mine! [laughs] So, they wanted me to help them through the process.

Richard: Your experience with HA must have been invaluable at this point.

Alan: That and my experience with open-source development, but things got weirdly personal. At this point I was the head of two competing open-source projects. Schizophrenia is the word that comes to mind to describe my mental state. Now you see how I have the perspective I have–I have heartbeat and I have Failsafe and I’m head of both projects. So, on one hand SuSE wants me to write a component for a reset service for Failsafe. At the time, I needed a reset service for heartbeat too, and I didn’t want to write two reset services. So, this is really very personal. Most of all it occurred to me that no one in open source should care how the machine gets reset. It’s not a selling point; as long as the machine gets reset, that’s all that matters.

Richard: So at this point, you had detected a baseline service that everybody needs.

Alan: Yes, and I’m in charge of two projects that both need it. I looked at Failsafe, and I thought it would be complicated to embed the code right in Failsafe. I thought it would be better to provide the function in a service that only does reset.

Richard: Now at this point, you don’t have an inkling about anything called Open Cluster Framework?

Alan: I have an inkling that this is not the only example of needing a component like this, and I had it in the back of my mind. I didn’t have a name for it.

Richard: So, reset was chosen for you as the first component, and you realized this is not the last example of a service needed in a lot of different clusters.

Alan: I also noticed that I liked some things about heartbeat better than Failsafe, and I wanted to use my code in Failsafe. And Failsafe had some stuff that I wanted.

Richard: How did you resolve your schizophrenia?

Alan: Well, I created this reset service and put it in heartbeat because I controlled the write access there–I didn’t have Failsafe write access yet. So I developed it and tested it for heartbeat, but I kept it completely outside of Failsafe. I said, here’s the nature of reset services. You can do things like ask it, “What kinds of machines can you reset?”, and reset this computer by name. Anyway, over time it became the de facto standard for resetting computers; there are 12 implementations and it’s used in three open-source projects.

At the start, we had one reset component, and my friend, Lars Marowsky-Bree, was working to get SuSE Linux and Failsafe certified as an official SAP platform, which is a big accomplishment, a lot of work. It means a lot in Germany–if you had to pick one mission critical application it would be SAP. So, Lars had a big demonstration coming up in three days, but the only reset device he had ran on 110 power only. So, he had a power switch and I didn’t have any specs, and he wants to demo it for Failsafe and heartbeat. Coincidentally someone asked me to go talk to the Atlanta Linux User’s Group. So I have this plugin architecture, and someone in Atlanta happened to write a plugin that matched the power switch that Lars had in Germany. And Lars needed it in like three days time, and I found it because I went to the Atlanta LUG! So they put it in, Lars picked it up, and the demo went flawlessly.

My reaction at the end of all this was, this open stuff works! We can accomplish something far beyond what we could have accomplished ourselves. We are working in a way that makes it possible.

Richard: It goes right back to Eric Raymond.

Alan: It goes right back to Eric’s observations…let’s do this again! [Alan is jumping around the office.] So not long after this, a project called Kimberlite came on the market. Similar to Failsafe, it came to market under the closed-source rules, and they wanted to become open source. Well, they wanted to use our reset solution as well. And they contributed code that Failsafe uses and heartbeat uses.

Richard: So, you’ve got three groups all working on HA?

Alan: Yes, and we’re all going off in separate directions.

Richard: Is this the environment that spawned OCF? When did it formalize?

Alan: Yes, you can see how this is the environment for a framework. So, I was preaching that we should have common components.

Richard: So how did OCF start to grow?

Alan: Backing up a bit, I sat down one day and thought about why the project wasn’t going forward. I mean, I was writing things but no one else was contributing. So I thought, why is that? Maybe no one knows what to do. I never sat down and told anyone what to do. So, I took several hours and wrote a to-do list. There were some critical things that I wanted to do, but I put them down on the list anyway, and posted the list on the mailing list. It was 90 minutes later when someone volunteered.

Richard: So this is when you realized the whole open-source methodology works?

Alan: Not only does it work, but look at the speed! I was shocked to see such a quick reaction–someone from Finland mentioned that heartbeat didn’t have authentication. I realized that if Linux-HA was going to grow, it required a simple configuration and good security (to protect the users). Security was probably the top thing on my list that I didn’t think I had to do myself. So, three or four weeks later he turned the code in, and it was my first big contribution from someone else.

Richard: You found out how useful a to-do list was in running the project–it’s kind of like Tom Sawyer white washing the fence?

Alan: Well, yeah, you have to tell them what fence we’re painting today. A lot of people want to help; they don’t want to spend time finding out what to do, they just want to do something. Part of my job as a project leader is making sure that the right person is matched to the right project. My wife calls it being a good king.

Richard: How would you characterize the state of the OCF today?

Alan: We’re at the point where we’re doing internal drafts of standards in the two to three key areas we’re working in. We’ve done lots of work to get participation, two or three proprietary vendors, basically every open-source HA project and several HPC participants as well. One of the things I want to point out is that OCF does not have high availability in its title. That is deliberate, because when you have a cluster of computers, regardless of what they do, they all have some fundamental functions. For example, sometimes you want to kill a node in both HA and HPC. What you don’t want is both services fighting over who gets to shoot the node. So you need to coordinate that, and you don’t want two ways of doing it.

Richard: You only shoot the node once, and only in one way.

Alan: Yes, what you don’t want are two contradictory truths because eventually something bad will happen. If you have an HA membership layer, and an HPC membership layer and each of them thinks the membership consists of a slightly different set of computers, something is not going to work.

Richard: Yes, that’s patently a bad thing, isn’t it?

Alan: Yes, it’s a bad thing. But on the other hand, if you have a membership layer that the two of them can share from a single API, then you’ve eliminated this type of possible error. It won’t make everything work, but it makes it possible for everything to work.

Richard: Would you say that the OCF is inclusive, that anyone can join?

Alan: Absolutely, I have gone out of my way, repeatedly, to actively solicit members to the OCF. I would say that the only groups that have actually refused are those that don’t have the resources to apply to it right now.

Richard: Is there a hierarchy in the group, and is it formal or informal?

Alan: The group is run informally. The two people that have been pushing it the hardest are myself and Lars Marowsky-Bree. We kind of drew up a map of what we wanted to cover in the standards, and at that point anyone who wanted to participate in each area could. We communicate via e-mail, by voice conferences and by face-to-face meetings. I don’t think we do enough with voice and face-to-face meetings.

Richard: Alan, what is the association between OCF and the Free Standards Group (FSG)?

Alan: We are in the process of applying to become an FSG working group. I’ve followed all the procedures I’ve been given so far, and I’m waiting to find out what to do next. We are very interested in joining the FSG, and they are excited about the possibilities of working with us.

Richard: Do you feel this affiliation is advantageous to the OCF?

Alan: Yes, first because the OCF is committed to providing standards for Linux. Our open standards are oriented toward Linux; however, we do nothing to preclude them from other versions of UNIX. We try not to do anything to make it unable to run on FreeBSD, for example. So I believe that association with the FSG is good because it is the primary standards body for Linux.

Richard: Tell me, how much buy-in have you gotten from the HPC community?

Alan: With the HPC community we’re engaged in a process. There are really two or three major HPC projects that are looking at something analogous to OCF–one is OSCAR and the other is NPACI Rocks. There is mutual interest and an understanding (between HA and HPC) that this might be a profitable way to cooperate. And there are various amounts of effort to make this happen. Everyone is busy doing their own thing, so some of it is learning each other’s language so we can communicate. I personally am spending time learning about how the HPC community functions so we can have some kind of mutual terms. So, we’re in a socialization process.

Richard: So you’d say the HA group is leading the OCF project, but information is flowing back and forth between HA and HPC, and there seems to be some interest between the leading projects and OCF?

Alan: That’s much more concise, yes. I haven’t seen a desire to duplicate what we’re doing or to dismiss it. So, that’s sufficient for now. It’s clear that we have different cultures and languages to talk about these things, which adds to the difficulty in joining us all together.

Richard: Being a leader of the OCF, what do you see happening in the next three to six months, and then after that?

Alan: I see us coming out with a real external draft of the standard in the next six or so months. During this time we expect to continue our involvement with the FSG, and then in the next three to six months, to become an official working group. We also expect to see the beginnings of a reference implementation. Beyond the six months, I expect to circulate this draft, collect comments from everyone not yet involved, pick up momentum and get additional attention. During the next six months, I expect some of the HA and HPC work to come to fruition. The HPC world will start to look at the APIs and wonder how to use them.

Richard: Alan, one final question: what did I forget to ask that you’d like to address?

Alan: Two things: clusters are revolutionary, and open-source clusters on commodity hardware are even more so. Our ability to make this revolution happen and reap the benefits for all the people who want to use clusters is dependent on our ability to work together. In that respect, it’s very dependent on things like standardization. However, standardization on high-end systems can change things dramatically–the top scientific machines are all clusters. The most cost effective clusters are overwhelmingly open-source clusters built on commodity hardware. They are radically less expensive than their ancestors. If you look at high availability, something similar applies, but people haven’t come to realize it so quickly because there hasn’t been someone like Donald Becker pushing the idea so hard. So maybe three to five years from now HA will have the same realization as HPC clusters have today, in ways that people never thought to apply them. We have a chance to do 100 times the number of HA clusters than we did before because the cost barriers are down. The potential here is tremendous, and we need to leverage it. Standards are an important part of making this happen.

Richard Ferri is a senior programmer in IBM’s Linux Technology Center, where he works on open-source Linux clustering projects such as LUI and OSCAR. He now lives in upstate New York with his wife, Pat, three teenaged sons and three dogs of suspect lineage.

[HA] Keeping a website always online with an Apache cluster on High Availability Linux

Filed under: High Availability — encyclopedia @ 3:16 am

8/11/2006, 8:34

Failover clusters are used to keep system services and applications available through attacks, hardware failures, and environmental mishaps. This article shows how to build a reliable, efficient two-node Apache cluster with the heartbeat application from The High-Availability Linux project. The cluster was tested on Fedora Core 5, CentOS 4.3, and Ubuntu 6.06.1 LTS server.

In a cluster environment, the high availability (HA) system is responsible for starting and stopping services, mounting and unmounting resources, monitoring the availability of the systems in the cluster, and managing ownership of the virtual IP address shared between the cluster nodes. The heartbeat service provides the basic functions that the HA system needs.

The most common cluster configuration is standby, which is the one described below. In this configuration one node does all the work while the other node sits idle. Heartbeat monitors the health of specific services, usually over a separate Ethernet interface reserved for the HA system, using a special ping. If the active node fails for any reason, heartbeat moves all HA components to the healthy node. When the failed node recovers, it can resume its previous role.

Installation and configuration

To test High Availability Linux you need a second Ethernet adapter on each node, dedicated to heartbeat. Install the Apache web server and the heartbeat program on both nodes. If the heartbeat package is not in your distribution's repositories, you can download it from the Linux-HA project. On the CentOS servers I used yum to install the required software:

yum install -y httpd heartbeat

The configuration files for heartbeat are not put in place when the software is installed. Copy them from the documentation directory to /etc/ha.d/:

cp /usr/share/doc/heartbeat*/ha.cf /etc/ha.d/
cp /usr/share/doc/heartbeat*/haresources /etc/ha.d/
cp /usr/share/doc/heartbeat*/authkeys /etc/ha.d/

In /etc/hosts, add the hostnames and IP addresses so that the two nodes can communicate with each other. In my case it looks like this:

192.168.1.1 node1.example.com node1
192.168.1.2 node2.example.com node2

Make sure /etc/hosts is identical on both nodes and that the nodes can ping each other. The simplest way to keep the files in sync is to copy the file from one node to the other with secure copy:

scp /etc/hosts root@node2:/etc/
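
One way to confirm that both names are present in the hosts file before moving on is a small check like the one below. This is only a sketch: the function name check_hosts is made up for this example, and it assumes a POSIX shell with awk available.

```shell
#!/bin/sh
# check_hosts FILE NAME...
# Hypothetical helper: succeeds only if every NAME appears as a hostname
# or alias on a non-comment line of FILE (a hosts-format file).
check_hosts() {
    hosts_file=$1
    shift
    for name in "$@"; do
        # Strip comments, skip the address field, then compare every
        # remaining field (hostname or alias) with the wanted name.
        if ! awk -v want="$name" '
            { sub(/#.*/, "") }
            { for (i = 2; i <= NF; i++) if ($i == want) found = 1 }
            END { exit (found ? 0 : 1) }
        ' "$hosts_file"; then
            echo "missing from $hosts_file: $name" >&2
            return 1
        fi
    done
    return 0
}
```

For example, running `check_hosts /etc/hosts node1.example.com node2.example.com && echo ok` on each node should print ok once the file has been copied.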

Next, edit /etc/ha.d/ha.cf so that it contains the following entries, which heartbeat needs in order to run:

logfile /var/log/ha-log # where to log everything from heartbeat
logfacility local0      # facility to use for syslog or logger
keepalive 2             # time between heartbeats
deadtime 30             # time until a host is declared dead
warntime 10             # time before issuing a "late heartbeat" warning
initdead 120            # initial dead time (initdead)
udpport 694             # UDP port for bcast or ucast communication
bcast eth1              # broadcast interface
ucast eth1 10.0.0.1     # two-node cluster, so multicast is not needed
auto_failback on        # automatically move resources back to the primary node
node node1.example.com  # name of the first node
node node2.example.com  # name of the second node

These are the basic options heartbeat needs in order to work. The file must be identical on both nodes except for the "ucast" line, which holds the IP address of the peer to which packets are sent.
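
Because the two copies of ha.cf should differ only in the ucast line, a quick comparison can catch accidental drift between the nodes. The helper below is a hypothetical sketch (same_except_ucast is not part of heartbeat). It assumes the other node's file has already been copied somewhere local, for instance with scp.

```shell
#!/bin/sh
# same_except_ucast FILE_A FILE_B
# Hypothetical helper: succeeds when the two ha.cf files are identical
# once every line starting with the "ucast" directive is ignored.
same_except_ucast() {
    # grep -v exits non-zero when it prints nothing, hence "|| true".
    a=$(grep -v '^[[:space:]]*ucast[[:space:]]' "$1" || true)
    b=$(grep -v '^[[:space:]]*ucast[[:space:]]' "$2" || true)
    [ "$a" = "$b" ]
}
```

A possible use on node1, with /tmp/ha.cf.node2 as an assumed scratch path: `scp root@node2:/etc/ha.d/ha.cf /tmp/ha.cf.node2` followed by `same_except_ucast /etc/ha.d/ha.cf /tmp/ha.cf.node2 || echo "ha.cf differs"`.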

The next file is /etc/ha.d/haresources. In it you define the name of the primary node, the virtual IP address (cluster IP), and the resources to start. In our case the resource is the Apache web server.

Only one line is needed here:

node1.example.com 192.168.1.5 httpd

Make sure this file is exactly the same on both nodes. Note that the resource name is the name of an init script in /etc/init.d. If the resource name does not exactly match the name of the script in /etc/init.d, heartbeat will not find the script when it tries to run it, and neither Apache nor heartbeat will start.
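
Since a mismatched resource name stops both Apache and heartbeat from starting, it can be worth checking the names before the first start. The function below is a rough sketch with a made-up name, check_resources. It assumes one resource group per line with no backslash continuations, treats any field that looks like an IPv4 address as the virtual IP, and takes the script directories to search as arguments.

```shell
#!/bin/sh
# check_resources HARESOURCES_FILE SCRIPT_DIR...
# Hypothetical helper: verifies that every resource named in the file
# has an executable script of the same name in one of the SCRIPT_DIRs.
check_resources() {
    res_file=$1
    shift
    status=0
    # Drop comments, then list every field after the node name.
    for res in $(sed -e 's/#.*//' "$res_file" |
                 awk '{ for (i = 2; i <= NF; i++) print $i }'); do
        name=${res%%::*}    # strip any "::argument" suffix
        case $name in
            [0-9]*.[0-9]*.[0-9]*.[0-9]*) continue ;;  # virtual IP, not a script
        esac
        found=no
        for dir in "$@"; do
            if [ -x "$dir/$name" ]; then found=yes; fi
        done
        if [ "$found" = no ]; then
            echo "no executable script named '$name'" >&2
            status=1
        fi
    done
    return $status
}
```

On the nodes described here this might be called as `check_resources /etc/ha.d/haresources /etc/init.d /etc/ha.d/resource.d`. The second directory is included because heartbeat ships its own resource scripts there.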

The last heartbeat-related file is /etc/ha.d/authkeys. It too must be identical on both nodes, and it must be readable and writable only by the root user. If the permissions are set any other way, heartbeat will refuse to start. Configure the file like this:

auth 1
1 crc

and restrict read and write access to the root user:

chmod 600 /etc/ha.d/authkeys
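
The crc method only detects corrupted packets and involves no shared secret, so it is usually considered reasonable only on a physically private link such as a crossover cable. The authkeys file also accepts md5 and sha1 with a shared key. As a sketch, the hypothetical helper below generates a random key and writes a sha1 authkeys file with root-only permissions. The function name, the key length, and the use of /dev/urandom are choices made for this example.

```shell
#!/bin/sh
# write_authkeys PATH
# Hypothetical helper: writes a heartbeat authkeys file that uses sha1
# with a freshly generated random key, readable and writable by owner only.
write_authkeys() {
    target=$1
    # 16 random bytes rendered as 32 hexadecimal characters.
    key=$(od -An -N16 -tx1 /dev/urandom | tr -d ' \n')
    (
        umask 077    # a newly created file gets mode 600
        printf 'auth 1\n1 sha1 %s\n' "$key" > "$target"
    )
    chmod 600 "$target"    # also tightens a file that already existed
}
```

Run it once on one node, for example `write_authkeys /etc/ha.d/authkeys`, then copy the resulting file to the other node, since both nodes must hold the same key.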

Now configure the Apache service. We want Apache to listen on the virtual IP address 192.168.1.5, and we need to point Apache's document root at the /data mount point, where the web files are stored. Apache's storage can be anything from a directory on the local file system to a storage area network. Of course, if the data is not the same on both nodes, there is no point in having a failover cluster. If you don't have external network storage hardware (such as Fibre Channel), you can mount any file system such as SMB, NFS, iSCSI, or SAN as a local directory so that the data is accessible on whichever node is active. Set the listen address and document root by editing the following directives in /etc/httpd/conf/httpd.conf (at least on CentOS):

Listen 192.168.1.5:80
DocumentRoot "/data"

It is important to disable Apache's automatic start at boot time, because heartbeat will start and stop the service as needed. Disable automatic start with the following command (on Red Hat-based systems):

chkconfig httpd off

Make sure the Apache configuration is the same on both nodes.

Testing

Now we can test the configuration we have just set up. To bring up the new cluster, start the heartbeat service on both nodes:

/etc/init.d/heartbeat start

Watch the /var/log/ha-log file on both nodes. If everything is configured correctly, you should see log entries like these:

Configuration validated. Starting heartbeat 1.2.3.cvs.20050927
heartbeat: version 1.2.3.cvs.20050927
Link node1.example.com:eth1 up.
Link node2.example.com:eth1 up.
Status update for node node2.example.com: status active
Local status now set to: 'active'
remote resource transition completed.
Local Resource acquisition completed. (none)
node2.example.com wants to go standby [foreign]
acquire local HA resources (standby).
local HA resource acquisition completed (standby).
Standby resource acquisition done [foreign].
Initial resource acquisition complete (auto_failback)
remote resource transition completed.

Next, test failover by rebooting the master server. The slave server should take over the Apache service. If everything works, you will see something like this on the slave:

Received shutdown notice from 'node1.example.com'.
Resources being acquired from node1.example.com.
acquire local HA resources (standby).
local HA resource acquisition completed (standby).
Standby resource acquisition done [foreign].
Running /etc/ha.d/rc.d/status status
Taking over resource group 192.168.1.5
Acquiring resource group: node1.example.com 192.168.1.5 httpd
mach_down takeover complete for node node1.example.com.
node node1.example.com: is dead
Dead node node1.example.com gave up resources.
Link node1.example.com:eth1 dead.

When the master comes back online, it takes the Apache service back:

Heartbeat restart on node node1.example.com
Link node1.example.com:eth1 up.
node2.example.com wants to go standby [foreign]
standby: node1.example.com can take our foreign resources
give up foreign HA resources (standby).
Releasing resource group: node1.example.com 192.168.1.5 httpd
Local standby process completed [foreign].
remote resource transition completed.
Other node completed standby takeover of foreign resources.
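
The log excerpts above contain a lot of detail, and during a test it can help to pull out only the lines that mark a takeover or a release. The filter below is a minimal sketch. The function name failover_events is invented here, and the patterns are taken from the sample messages shown in this article, so other heartbeat versions may word them differently.

```shell
#!/bin/sh
# failover_events LOGFILE
# Hypothetical helper: prints only the log lines that indicate a node
# death, a resource takeover, or a resource release.
failover_events() {
    grep -E 'is dead|Taking over resource group|Releasing resource group|takeover complete' "$1"
}
```

For example, `failover_events /var/log/ha-log` on the slave after rebooting the master should list the takeover of 192.168.1.5.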

Conclusion

Those are all the steps needed to build a low-cost, highly available web server cluster. There are of course many commercial products that serve the same purpose, but for small businesses and similar organizations, High Availability Linux and heartbeat are a sensible choice.
