Piranha for LVS

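The piranha is a carnivorous freshwater fish of South America. Piranhas are usually 15 to 25 cm (6 to 10 inches) long, although specimens up to 40 cm have occasionally been found. They have sharp teeth (able to bite through a steel fishhook or a human finger with ease [1]) and are extremely ferocious: once prey is spotted they tend to attack it as a group, and they are said to be able to strip a live cow to a row of white bones within 10 minutes. Local people use their teeth to make tools and weapons. Piranhas are commonly found in rivers such as the Amazon, the rivers of Guyana, and the Paraguay.

piranha (Chinese: 水虎鱼, 食人鱼)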

piranha:

Summary     : Cluster administation tools
Description : Various tools to administer and configure the Linux Virtual Server as well as heartbeating and failover components.  The LVS is a dynamically adjusted kernel routing mechanism that provides load balancing primarily for web and ftp servers though other services are supported.

Components of piranha:

/usr/sbin/pulse
heartbeating daemon for monitoring the health of cluster nodes.

/usr/sbin/lvsd
daemon to control the Red Hat clustering services.

/usr/sbin/nanny
tool to monitor status of service in a cluster.

/usr/sbin/fos
failover services daemon to control the Red Hat clustering service.

/usr/sbin/send_arp
tool  to  notify network of a new IP address / MAC address mapping.
This tool is very useful. I briefly introduced it in an earlier post, "A send_arp script".

On the LVS router, three services need to be set to activate at boot time.

  • piranha-gui
  • pulse
  • sshd

If you are clustering multi-port services or using firewall marks, you must enable the iptables service.
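
A minimal sketch of how these services might be switched on at boot. It assumes a CentOS 5 style system where services are managed with `/sbin/chkconfig` and runlevels 3 and 5 are in use; the function names and the `DRY_RUN` and `USE_FWMARK` variables are made up for this example.

```shell
#!/bin/bash
# Sketch: enable the LVS router services at boot.
# Assumption: chkconfig manages services; runlevels 3 and 5 are used.

# DRY_RUN=1 (the default in this sketch) prints commands instead of running them.
DRY_RUN=${DRY_RUN:-1}

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

enable_lvs_services() {
    local svc
    for svc in piranha-gui pulse sshd; do
        run /sbin/chkconfig --level 35 "$svc" on
    done
    # iptables is only needed for multi-port services or firewall marks.
    if [ "${USE_FWMARK:-0}" = "1" ]; then
        run /sbin/chkconfig --level 35 iptables on
    fi
}
```

Calling `enable_lvs_services` prints the three chkconfig commands; run it as root with `DRY_RUN=0` to apply them, and set `USE_FWMARK=1` to include iptables.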

Clustering of CentOS 5.2

http://www.centos.org/docs/5/html/5.2/Cluster_Administration/

The Clustering group in CentOS Linux 5.2 (i386) contains the following packages:

yum groupinfo Clustering
Group: Clustering
Description: Clustering Support.
Default Packages:
clustermon
conga-devel
ricci
system-config-cluster
ipvsadm
piranha
cluster-snmp
modcluster
ricci-modcluster
cluster-cim
rgmanager
luci

Actual installation:
Installing:

cluster-cim : CentOS Cluster Suite - CIM provider
cluster-snmp : CentOS Cluster Suite - SNMP agent
luci : Remote Management System - Management Station
piranha : Cluster administation tools
rgmanager : Open Source HA Resource Group Failover for CentOS
ricci : Remote Management System - Managed Station
system-config-cluster : system-config-cluster is a utility which allows you to manage cluster configuration in a graphical setting.
Installing for dependencies:
cman : cman - The Cluster Manager
gnome-python2-canvas : Python bindings for the GNOME Canvas.
httpd : Apache HTTP Server
ipvsadm : Utility to administer the Linux Virtual Server
lm_sensors : Hardware monitoring tools.
modcluster : CentOS Cluster Suite - remote management
net-snmp : A collection of SNMP protocol tools and libraries.
net-snmp-libs : The NET-SNMP runtime libraries.
openais : The openais Standards-Based Cluster Framework executive and APIs
perl-Net-Telnet : Net-Telnet Perl module
perl-XML-LibXML : XML-LibXML Perl module
perl-XML-LibXML-Common : XML-LibXML-Common Perl module
perl-XML-NamespaceSupport : XML-NamespaceSupport Perl module
perl-XML-SAX : XML-SAX Perl module
php : The PHP HTML-embedded scripting language. (PHP: Hypertext Preprocessor)
php-cli : Command-line interface for PHP
php-common :  Common files for PHP
pygtk2-libglade : A wrapper for the libglade library for use with PyGTK
python-imaging : Python’s own image processing library
tix : A set of extension widgets for Tk
tk : Tk graphical toolkit for the Tcl scripting language
tkinter : A graphical user interface for the Python scripting language.
tog-pegasus : OpenPegasus WBEM Services for Linux

Command Line Administration Tools
In addition to Conga and the system-config-cluster Cluster Administration GUI, command line tools are available for administering the cluster infrastructure and the high-availability service management components. The command line tools are used by the Cluster Administration GUI and init scripts supplied by Red Hat. Table 1.1, “Command Line Tools” summarizes the command line tools.

Table 1.1. Command Line Tools

ccs_tool (Cluster Configuration System Tool)
Used with: Cluster Infrastructure
Purpose: ccs_tool is a program for making online updates to the cluster configuration file. It provides the capability to create and modify cluster infrastructure components (for example, creating a cluster, adding and removing a node). For more information about this tool, refer to the ccs_tool(8) man page.

cman_tool (Cluster Management Tool)
Used with: Cluster Infrastructure
Purpose: cman_tool is a program that manages the CMAN cluster manager. It provides the capability to join a cluster, leave a cluster, kill a node, or change the expected quorum votes of a node in a cluster. For more information about this tool, refer to the cman_tool(8) man page.

fence_tool (Fence Tool)
Used with: Cluster Infrastructure
Purpose: fence_tool is a program used to join or leave the default fence domain. Specifically, it starts the fence daemon (fenced) to join the domain and kills fenced to leave the domain. For more information about this tool, refer to the fence_tool(8) man page.

clustat (Cluster Status Utility)
Used with: High-availability Service Management Components
Purpose: The clustat command displays the status of the cluster. It shows membership information, quorum view, and the state of all configured user services. For more information about this tool, refer to the clustat(8) man page.

clusvcadm (Cluster User Service Administration Utility)
Used with: High-availability Service Management Components
Purpose: The clusvcadm command allows you to enable, disable, relocate, and restart high-availability services in a cluster. For more information about this tool, refer to the clusvcadm(8) man page.
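
As a rough illustration of the service management commands, the sketch below wraps `clusvcadm` and `clustat` in one helper. The helper name `ha_service`, the `DRY_RUN` switch, and the service and node names in the usage note are assumptions made for this example; check the options against clusvcadm(8) on your own system before relying on them.

```shell
#!/bin/bash
# Sketch: a thin wrapper around clusvcadm and clustat.
# DRY_RUN=1 (the default in this sketch) prints commands instead of running them.
DRY_RUN=${DRY_RUN:-1}

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

ha_service() {
    local action="$1" service="$2" member="$3"
    case "$action" in
        enable)   run clusvcadm -e "$service" ;;
        disable)  run clusvcadm -d "$service" ;;
        restart)  run clusvcadm -R "$service" ;;
        relocate) run clusvcadm -r "$service" -m "$member" ;;
        status)   run clustat ;;
        *)
            echo "Usage: ha_service {enable|disable|restart|relocate|status} [service] [member]" >&2
            return 1
            ;;
    esac
}
```

For example, `ha_service relocate webfarm node2` (hypothetical service and member names) prints `clusvcadm -r webfarm -m node2`; with `DRY_RUN=0` it would run the command instead.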

A summary of distributed and cluster file systems

In no particular order:

Lustre
Lustre is a scalable, secure, robust, highly-available cluster file system. It is designed, developed and maintained by Sun Microsystems, Inc.
Designed to meet the demands of the world’s largest high-performance compute clusters, the Lustre file system redefines scalability and provides groundbreaking I/O and metadata throughput. An object-based cluster, Lustre currently supports tens of thousands of nodes, petabytes of data, and billions of files — and development is underway to support one million nodes, trillions of files, and zetta to yotta bytes.
http://www.sun.com/software/products/lustre/
http://wiki.huihoo.com/index.php?title=Lustre

AFS
AFS Reference Page

OpenAFS
What is AFS?
AFS is a distributed filesystem product, pioneered at Carnegie Mellon University and supported and developed as a product by Transarc Corporation (now IBM Pittsburgh Labs). It offers a client-server architecture for file sharing, providing location independence, scalability and transparent migration capabilities for data. OpenAFS is the Transarc source code released as it looked like around AFS3.6 under IBM Public License IPL.

Arla
Arla is a free AFS implementation.
The main goal is to make a fully functional client with all capabilities of AFS as formerly sold by Transarc and today available as OpenAFS. Other stuff, such as servers and management tools are being developed, but currently not considered stable.

Coda
Coda distributed file system: http://www.bsdmap.com/diary/coda.php
Coda File System http://www.coda.cs.cmu.edu/
Coda is a fork of AFS that supports disconnected and weakly connected operation better than AFS.

InterMezzo
InterMezzo is a new distributed file system with a focus on high availability. InterMezzo will be suitable for replication of servers, mobile computing, managing system software on large clusters, and for maintenance of high availability clusters.

xFS
xFS is a Serverless Network File Service.

CFS
Cluster File Systems, Inc. is the leading developer of next generation technology for scalable high-performance file systems. Our Lustre® file system redefines scalability and has been designed from the ground up to meet the demands of the world’s largest high-performance computer clusters.

GlusterFS
GlusterFS is a cluster file-system capable of scaling to several peta-bytes. It aggregates various storage bricks over Infiniband RDMA or TCP/IP interconnect into one large parallel network file system. GlusterFS is based on a stackable user space design without compromising performance.

Scalable File Share
HP StorageWorks Scalable File Share
A high-bandwidth, scalable storage appliance for Linux clusters
http://h20311.www2.hp.com/HPC/cache/276636-0-0-0-121.html

MogileFS
MogileFS is our open source distributed filesystem. Its properties and features include:
1. Application level
2. No single point of failure
3. Automatic file replication
4. “Better than RAID”
5. Flat Namespace
6. Shared-Nothing
7. No RAID required
8. Local filesystem agnostic

Hadoop
The Apache Hadoop project develops open-source software for reliable, scalable, distributed computing, including:
* Hadoop Core, our flagship sub-project, provides a distributed filesystem (HDFS) and support for the MapReduce distributed computing metaphor.
* HBase builds on Hadoop Core to provide a scalable, distributed database.
* ZooKeeper is a highly available and reliable coordination system. Distributed applications use ZooKeeper to store and mediate updates for critical shared state.

PVFS
http://www.pvfs.org/
http://www.parl.clemson.edu/pvfs/
PVFS is designed to provide high performance for parallel applications, where concurrent, large IO and many file accesses are common. PVFS provides dynamic distribution of IO and metadata, avoiding single points of contention, and allowing for scaling to high-end terascale and petascale systems.

GFS

http://en.wikipedia.org/wiki/Global_File_System
http://www.redhat.com/docs/manuals/csgfs/
GFS (Global File System) is a cluster file system. It allows a cluster of computers to simultaneously use a block device that is shared between them (with FC, iSCSI, NBD, etc…). GFS reads and writes to the block device like a local filesystem, but also uses a lock module to allow the computers coordinate their I/O so filesystem consistency is maintained. One of the nifty features of GFS is perfect consistency — changes made to the filesystem on one machine show up immediately on all other machines in the cluster.


1. HP OpenVMS
————–
The first to work with a CFS is HP OpenVMS. Oracle Parallel Server and RAC always used
the OpenVMS filesystem (RMS) for its database.

2 HP Tru64
————
CFS is a layer on top of Advfs the filesystem of HP Tru64. Oracle uses
the Direct I/O feature available in CFS. Direct I/O enables Oracle to bypass
the buffer cache (no caching at filesystem level). Oracle manages the
concurrent access to the file itself; as it does on raw devices. On CFS,
without Direct I/O enabled on files - file access goes through a CFS server.
A CFS server runs on a cluster member and serves a file domain. A file
domain can be relocated from one cluster member to another cluster member
online. A file domain may contain one or more filesystems.

Direct I/O does not go through the CFS server, but file creation and resizing
is seen as metadata operation by advfs and this has to be done by the CFS
server.  The consequence is to run file creations and resizing on the node
where the CFS server is located. File operations might take longer when the
CFS server is remote.

Oracle recommends not using the tempfile option, as tempfiles might not be
allocated until the tempfile blocks are accessed and so cause
‘remote metadata operations’ for advfs.

3 Veritas
———–
VERITAS Database Edition(TM) / Advanced Cluster for Oracle9i RAC enables Oracle
to use the CFS.  The VERITAS Cluster File System is an extension of the VERITAS
File System (VxFS).  Veritas CFS allows the same filesystem to be simultaneously
mounted on multiple nodes.  Veritas CFS is designed with a master/slave
architecture.  Any node can initiate a metadata operation (create, delete, or
resize data), but the actual operation is carried out by the master node. All other
(non metadata) IO goes directly to the disk.

CFS is used in DBE/AC to manage a filesystem in a large database environment.
When used in DBE/AC for Oracle9i RAC, Oracle accesses data files stored on CFS
filesystems by bypassing the filesystem buffer and filesystem locking for data.

4 Oracle Cluster File System
——————————
Oracle Cluster File System (OCFS) is a shared filesystem designed specifically
for Oracle Real Application Clusters. OCFS eliminates the requirement for Oracle
database files to be linked to logical drives and enables all nodes to share a
single Oracle_Home (current capabilities are detailed in section 2.8) instead
of requiring each node to have its own local copy. OCFS volumes can span one
shared disk or multiple shared disks for redundancy and performance
enhancements.

5. Netapp(R) Filer
——————-
Netapp Filer offers CFS functionality via NFS to the server machines. These
filesystems are mounted using special mount options. For details please see
Netapp documentation.

Netapp certifications can be found at:

http://www.netapp.com/part…

To understand the architecture and Oracle installation please see these
documents:

Note 210889.1: RAC Installation with a NetApp Filer in Red Hat Linux Environment
and
Oracle9i RAC Installation with a NetApp Filer on Fujitsu-Siemens Primepower
(Solaris8 Operating System) at http://www.netapp.com/tech…

6 AIX
——-
IBM’s General Parallel File System (GPFS) allows users shared access to files
that may span multiple disk drives on multiple nodes. GPFS provides access to
all data from all nodes of the cluster.  It can be configured with multiple
copies of metadata allowing continued operation should the paths to a disk or
the disk itself be broken. Metadata is the filesystem data that describes
the user data.  GPFS allows the use of RAID or other hardware redundancy
capabilities to enhance reliability.

In Oracle9i GPFS is only supported with HACMP/ES in a RAC configuration.
When placing datafiles on GPFS no CRM (Concurrent Resource Manager) needs to be
installed. Starting with Oracle10g HACMP is no longer required to use GPFS.

Metalink contains certification information and information about required
patches for having a cluster database on a GPFS.

7 Sun GFS
———–
Global File Service (GFS or Cluster File System) is a filesystem that is
accessible from all nodes in the cluster. GFS is based on global devices and
has a client/server architecture. GFS provides transparent and concurrent file
access.

Note that Sun GFS is not supported for Oracle datafiles, see section 3.10.

8 Sun StorEdge QFS
——————–
QFS software is a file manager that provides a shared filesystem where multiple
servers can read and write simultaneously to the same file in the same filesystem.

9 Other Linux Cluster Filesystems
———————————–
There are various third party cluster filesystems available on Linux.
Consult the Oracle Certify website for the policy regarding support for third party
cluster file systems on Linux. Also, consult the RAC Technology Compatibility Matrix (RTCM)
for Linux (http://www.oracle.com/tech… … generic_linux.html)
for the latest information on which third party cluster file systems are supported
by RAC release and platform.

10 Which Platforms support what?
———————————-

Platform and                                Storage for           Storage for
[Cluster Software]                          Oracle installation   datafiles

AIX [HACMP]                                 LFS (1) or CFS (2)    CFS and/or Raw Devices
AIX [CRS]                                   LFS or CFS            CFS and/or Raw Devices
HP/UX [MC/Service Guard]                    LFS or CFS (3)        CFS (3) and/or Raw Devices
HP/UX PA-Risc [Veritas DBE/AC]              LFS or CFS            CFS and/or Raw Devices
Linux [oracm, CRS]                          LFS                   OCFS (4) and/or Raw Devices, also NFS (5)
OpenVMS                                     CFS                   CFS
Sun Solaris [Fujitsu Siemens Primecluster]  LFS                   Raw Devices/NFS (5)
Sun Solaris [Sun Cluster]                   LFS or CFS (6,7)      CFS (7) Raw Devices/NFS (5)
Sun Solaris [Veritas DBE/AC]                LFS or CFS            CFS and/or Raw Devices
Tru64 Unix                                  LFS or CFS            CFS and/or Raw Devices
Windows NT/2000 [oracm, CRS]                LFS or CFS            OCFS and/or Raw Devices
Windows 2003 (32/64bit) [oracm, CRS]        LFS or CFS            OCFS and/or Raw Devices

(1) LFS is the abbreviation for local filesystem and is only accessible directly
by the node that mounted the disk
(2) CFS is the abbreviation for Cluster FileSystem. The implementation
depends on the operating software vendor or cluster software vendor.
(3) MC ServiceGuard 11.17 includes a CFS which is supported with Oracle 10gR2
(4) OCFS: Oracle Cluster FileSystem
(5) NFS is supported with Netapp(R) Filer, see Metalink certification
(6) Sun GFS can only be used for Oracle_Home and archivelogs.
(7) Sun StorEdge QFS

Local Filesystem means that the Oracle Universal Installer replicates the
RAC software installation automatically to every private filesystem of the
selected nodes in the cluster. The Oracle installation products
are cluster aware and will not install the Oracle software to over-write itself.

Oracm is the Oracle Cluster manager, which is available on Linux and Windows
NT/2000. No other cluster manager is needed to setup Real Application Cluster.

Cluster Ready Services (CRS) are new in Oracle10g and also provide cluster manager
functionality.

Oracle will validate cluster filesystems of other vendors when they become
available. Oracle will support the Oracle software when running on a validated
cluster filesystem.

11 Cluster File System names
——————————

Platform or Cluster Vendor    CFS name

AIX                           GPFS
HP/UX MC/ServiceGuard         CFS
Linux [oracm, CRS]            OCFS
OpenVMS                       RMS
Tru64 Unix                    CFS
SunCluster                    GFS, QFS
Veritas DBE/AC                CFS
Windows NT/2000               OCFS
Windows 2003 (32/64bit)       OCFS

For more information on certified configuration please see the certification
matrix available on Metalink.  Instructions for accessing the certification
matrix can be found in the following note:

Note 184875.1
How To Check The Certification Matrix for Real Application Clusters

12 When to use CFS over raw?
——————————
This option is very dependent on the availability of a CFS on your platform.
A CFS offers:
- Simpler management
- Use of Oracle Managed Files with RAC
- Single Oracle Software installation
- Autoextend enabled on Oracle datafiles
- Uniform accessibility to archive logs in case of physical node failure
- With Oracle_Home on CFS, when you apply Oracle patches CFS guarantees that
the updated Oracle_Home is visible to all nodes in the cluster.

nginxctl

I wrote an nginx control script modeled on an existing init script. It runs on our CentOS Linux systems and works well. The code is as follows:

#!/bin/bash
# Author  : Cao Yuwei
# MSN     :
# QQ      :
# E-Mail  :

# Signals handled by the nginx master process:
# TERM, INT   fast shutdown
# QUIT        graceful shutdown
# HUP         reload configuration
# USR1        reopen log files
# USR2        upgrade the nginx binary gracefully
# WINCH       shut down worker processes gracefully

# Signals handled by the worker processes:
# TERM, INT   fast shutdown
# QUIT        graceful shutdown
# USR1        reopen log files

PATH=/sbin:/usr/sbin:/usr/local/sbin:/bin:/usr/bin:/usr/local/bin

# Source function library.
. /etc/rc.d/init.d/functions

# Source networking configuration.
. /etc/sysconfig/network

# Check that networking is up.
if [[ $NETWORKING == [Nn][Oo] ]]; then exit 0; fi

start() {
    echo -n 'Starting Nginx: '
    # Set the path to your nginx binary here: /opt/nginx/sbin/nginx
    daemon /opt/nginx/sbin/nginx "$EXTRAOPTIONS"
    local RETVAL=$?
    echo
    if [ $RETVAL -eq 0 ]; then touch /var/lock/subsys/nginx; fi
    return $RETVAL
}

stop() {
    echo -n 'Shutting down Nginx: '
    killproc nginx
    local RETVAL=$?
    # Under the old LinuxThreads library every thread is a separate process,
    # so wait for them to exit; with NPTL this branch is skipped.
    case $(getconf GNU_LIBPTHREAD_VERSION 2>/dev/null) in
    [Ll]inux[Tt]hreads*|lt*)
        # Wait until all threads have terminated (at most 20 x 0.2 s).
        local -i count=20
        while [[ $count -gt 0 ]] && pidof nginx > /dev/null
        do
            usleep 200000
            count=$((count - 1))
        done
        ;;
    esac
    echo
    if [ $RETVAL -eq 0 ]; then rm -f /var/lock/subsys/nginx; fi
    return $RETVAL
}

restart() {
    stop
    start
}

# Ask nginx to reopen its log files (for example after log rotation).
relog() {
    echo -n 'Relog Nginx: '
    killproc nginx -USR1
    local RETVAL=$?
    echo
    return $RETVAL
}

# Ask nginx to reload its configuration.
reload() {
    echo -n 'Reload Nginx: '
    killproc nginx -HUP
    local RETVAL=$?
    echo
    return $RETVAL
}

# Test the configuration file without touching the running server.
check() {
    /opt/nginx/sbin/nginx -t
}

#
#       See how we were called.
#
case "$1" in
start)
    start
    ;;
stop)
    stop
    ;;
reload)
    reload
    ;;
restart)
    restart
    ;;
relog)
    relog
    ;;
check)
    check
    ;;
*)
    echo $"Usage: nginxctl {start|stop|restart|reload|relog|check}"
    exit 1
esac

exit

SELinux HOWTO (Chinese PDF)

SELinux HOWTO

PAM (Linux)

#%PAM-1.0
# root can run su without authentication
auth sufficient pam_rootok.so
# accounts in the wheel group can run su without authentication
#auth sufficient pam_wheel.so trust use_uid
# only accounts in the wheel group may run su
auth required pam_wheel.so use_uid
auth include system-auth
account sufficient pam_succeed_if.so uid = 0 use_uid quiet
account include system-auth
password include system-auth
session include system-auth
session optional pam_xauth.so

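Sogou Wubi has been upgraded

Still the same friend.

Sogou Wubi was upgraded today. Very good, very powerful. For me, there is finally one input method that can rule them all.

Download the latest version of Sogou Wubi at http://wubi.sogou.com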

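Where is the key bottleneck in the LAMP architecture?

The following is a quotation:

My thinking started from an incident like this. The technical director of a website once asked me why their site was so slow and what to do about it. An engineer from Zend headquarters happened to be online in my MSN at the time, so I asked him what to do when PHP responds slowly. He told me straight away: it's a database problem! The database must not have been designed and optimized well. So I could not give that technical director a definite answer, because their database design was not something we could touch; I only gave some general suggestions on database optimization. This happened again and again, and I began to wonder why the engineers at Zend headquarters told me every time that the database was the problem. Can't we solve this at the PHP level? The answer is no! PHP already runs very fast. Zend's performance profiling shows that for a single user click, PHP's own execution takes less than 10% of the time. So what is PHP doing the rest of the time? It is waiting, waiting for the results of database queries. Current PHP products have improved a lot on this front through two approaches: caching and static page generation. Caching may be unfamiliar to many people, but static page generation is by now well understood even by users of PHP products. It is fast, easy for search engines to index, and so on; the benefits are obvious. Jokingly, making a site's home page static these days only requires a big enough hard disk. Caching is more complicated, and it is where most PHPers get headaches. Some people even implement it in C, because validating expiry, looking up, fetching, and updating cached data are all hard to handle. Of course, some people also use a database to handle caching.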

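Chrome: Google's browser

Google has released its own browser, named Chrome. Download it at: http://tools.google.com/chrome/

At first glance the UI is nice and it is comfortable to use. But it feels like it lacks distinctive features. Was it released too early?

More about Chrome

Here is the statement from the official blog:

We announced an open source browser, Google Chrome, ahead of schedule in the form of a comic book, and we believe you have already heard the news through blogs. On Tuesday we will launch a beta version of Google Chrome in more than 100 countries.

Why develop Google Chrome? Because we believe it can bring more value to users and help drive innovation on the Web.

At Google, a great deal of work is done in the browser: searching, chatting, email, collaborative development, and so on. In our spare time we use the browser to shop, log in to online banking, read the news, and keep in touch with friends. With so much time spent in the browser every day, we had to consider what kind of browser best suits the way the Web is developing. Web pages have evolved from simple text pages into rich media pages, which calls for rethinking the web browser. What we really need is not just a browser but a modern platform for web pages and applications. That is why we set out to build Google Chrome.

As its appearance shows, Google Chrome is designed to be simple and efficient, a true tool for browsing the Web. Like the Google home page, Google Chrome is clean and fast.

Google Chrome supports tabbed browsing, and each tab runs in its own "sandbox". This improves security, and a crash in one tab does not cause other tabs to close. Google Chrome is built on V8, a more powerful JavaScript engine, which makes possible things that current web browsers cannot do.

Of course, this is only the beginning, and Google Chrome still needs to improve in many respects. What we are about to release is a beta version for Windows, offered for discussion, and we hope to receive feedback from users. Mac and Linux versions are still under development and will be just as fast and efficient.

Google Chrome is open source software that draws on Apple's WebKit, Mozilla's Firefox, and other related projects. In the same spirit, we will open all of Google Chrome's source code. We look forward to working with the whole open source community to drive innovation on the Web.

In today's Web market there is more and more choice and innovation. We hope Google Chrome can become a new option and help take Web services to the next level.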

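Why PHPers are called grassroots developers

Opening note: the following text does not contain much technical vocabulary, so anyone interested in PHP can read it.

Are PHPers grassroots?

Since the day PHP was born, it has served programmers in web applications. As a scripting language tailor-made for web development, PHP has always embraced simplicity and open source, which let it grow quickly and strongly promoted the emergence and development of Web 2.0. Yet for a long time PHPers (PHP programmers) have been regarded as grassroots programmers, with little technical depth and a low level. This is especially pronounced in China.

I remember a technical manager telling me this story. He assigned a PHP development task to a programmer, and the programmer replied: "I come from a Java background. Asking me to write PHP, aren't you belittling me?" This left a deep impression on me and affected me a lot. Although it does not represent the view of most programmers, quite a few people probably think this way. Some also say that for large government projects nowadays, PHP will certainly not even be considered.

So why are PHPers seen as grassroots? Is it because PHP is simple, anyone can learn it, and so there is nothing hard about it? I used to think so too. PHP is quick to pick up, and handling files, data, remote connections, and network programming is all very convenient. There is even an official line to this effect: the cost of learning PHP is low, so it is easy to start using. This view is widespread, and even most PHPers themselves hold it.

At this point you can probably guess why I am writing this. More than a year of PHP evangelism work has shown me the general situation of a great many companies that use PHP, and along the way I gradually came to see the root cause. Calling it the root cause is my personal view, but I believe it matches the facts.

So why are PHPers seen as grassroots? The root cause is that the vast majority of what PHPers do (what they implement in code) belongs to the presentation layer, as anyone familiar with PHP knows. Of course some PHPers will say that such-and-such framework they wrote with an MVC structure has such-and-such features. But that is still the presentation layer. Programmers who only handle the presentation layer get labeled grassroots. And there is something to it, because in that situation it really is hard to build large applications with PHP.

Have we found the reason, then? Not quite. Why are PHPers always responsible for the presentation layer? The answer is that we generally do not touch the underlying data processing (a web application is, after all, storing and looking up data)! Some of you may have guessed it by now: isn't that the database? Yes, the database! The culprit that keeps PHPers grassroots is the database. Why?

In today's popular web architecture, a load balancing system sits at the front, web servers in the middle, and database servers at the back. So most PHPers work at the web server layer. Because the database already organizes the data nicely for us, PHP code contains few algorithms, and subconsciously people feel they are unnecessary, all the more so since they might hurt performance.

In this situation the PHPer becomes a database user. He is always operating the database rather than programming. The simplest PHP script connects to the database, fetches the data, and outputs it to the browser with a command. The whole thing is under 10 lines of code. It feels too simple, with no technical content at all. Why? Because the data processing has already been done by the database. This is especially true with MySQL! MySQL is free, so most programmers can use it freely, and it is fast enough, so building a PHP application is very simple. It is like being handed a gun and then feeling there is no need to learn martial arts. Of course, I am not saying that a gun is worse than martial arts. I am saying that with the arrival of the gun, even a child can kill easily.

Let us look in more detail at why it is the database, with an example. I visited a very well-known website in Beijing, where one of our fairly senior PHP programmers was talking about system architecture. I remember that when he asked the audience a question about an algorithm from a data structures course, not a single person in the room could answer it (myself included). He then started explaining some very basic data structures material, which took me straight back to the data structures course I took at university. These basic problems of sorting, searching, and passing data are very common in other languages (C, for example), but not in PHP! The PHPchina.com forum also has a board called PHP data structures and algorithms, and it has very few posts.

Thinking back carefully, the most discussed topics online fall into two areas. One is the use of classes in PHP (encapsulating procedures), and the other is development frameworks. But on closer analysis, these supposedly more complex PHP concepts contain no data processing! Why? Because there is a database! Adodb or PHP 5's PDO takes care of it! Does it really? No, these merely connect to the database; there is no data processing. So PHPers seem to have nothing they can put on the table.

Now a concrete coding problem: unlimited-level categories. I think everyone is familiar with this concept. I have seen two ways of handling it. The first is the way of the genuine PHPer, and it is currently the more popular one: let the database handle it. Few fields are needed; you only add a parent category field and test it. This method is practical and efficient too! But it no longer counts as data processing; it is database lookup!

The second was written in PHP by a C programmer. He fetches all the category information from the database, arranges it with data structure algorithms, and then outputs it.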
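
The second approach (fetch every category row once, then arrange the rows in memory) might look roughly like the sketch below. The article shows no code, so the sample rows, the `id parent name` layout, and the function name `print_tree` are all made up for illustration, and shell stands in for PHP here.

```shell
#!/bin/bash
# Sketch: build an unlimited-level category tree from rows fetched in one query.
# Each row is "id parent_id name"; parent_id 0 marks a top-level category.
# The data below is a stand-in for the result of a single SELECT.
CATEGORIES='1 0 Electronics
2 1 Phones
3 1 Laptops
4 2 Smartphones
5 0 Books'

print_tree() {
    # $1 = id of the parent whose children are printed, $2 = indent prefix
    local parent="$1" indent="$2" id pid name
    while read -r id pid name; do
        if [ "$pid" = "$parent" ]; then
            echo "${indent}${name}"
            # Recurse into this category's own children, one level deeper.
            print_tree "$id" "${indent}  "
        fi
    done <<EOF
$CATEGORIES
EOF
}
```

Running `print_tree 0 ""` prints Electronics, then Phones and Laptops indented beneath it, with Smartphones nested under Phones, followed by Books. The sketch rescans the whole list for every node, which is fine for a small category table; a real implementation would index the rows by parent first.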

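We will not compare the efficiency of the two approaches here; I think everyone has their own opinion. What I want to point out is the essential difference between them. The PHPer habitually uses the database, in a rather clever way and with high efficiency; this approach is database querying. The second approach is more distinctive. Its author regards the database as merely a place to store data, and believes the actual logic has to be handled by his own code.

The conclusion, therefore, is that the user of the second approach feels he is the stronger one, because he organized the data logic himself! He thinks the PHPer's approach amounts to nothing more than knowing how to query a database. So he considers PHPers grassroots, people who only know how to operate a database and lay out pages (fiddling with smarty and the like).

By now I imagine you have recalled a good deal of your own PHP development experience. Have you found that everyone really is just operating the database?

So let us discuss this. Is the database bad? Why have I never had a problem handling data with a database? What I want to say is that the database does have problems, and big ones! Of course, I am not saying that databases should not be used, nor am I belittling their performance. Rather, we have not fully understood the role the database plays.

My thinking started from an incident like this. The technical director of a website once asked me why their site was so slow and what to do about it. An engineer from Zend headquarters happened to be online in my MSN at the time, so I asked him what to do when PHP responds slowly. He told me straight away: it's a database problem! The database must not have been designed and optimized well. So I could not give that technical director a definite answer, because their database design was not something we could touch; I only gave some general suggestions on database optimization. This happened again and again, and I began to wonder why the engineers at Zend headquarters told me every time that the database was the problem. Can't we solve this at the PHP level? The answer is no! PHP already runs very fast. Zend's performance profiling shows that for a single user click, PHP's own execution takes less than 10% of the time. So what is PHP doing the rest of the time? It is waiting, waiting for the results of database queries. Current PHP products have improved a lot on this front through two approaches: caching and static page generation. Caching may be unfamiliar to many people, but static page generation is by now well understood even by users of PHP products. It is fast, easy for search engines to index, and so on; the benefits are obvious. Jokingly, making a site's home page static these days only requires a big enough hard disk. Caching is more complicated, and it is where most PHPers get headaches. Some people even implement it in C, because validating expiry, looking up, fetching, and updating cached data are all hard to handle. Of course, some people also use a database to handle caching.

So when traffic surges, many of the problems a PHP-based site runs into originate in the database. Database synchronization is the least of them. The key point is that the database's response speed degrades exponentially. I put this question to a vice president of MySQL at the LAMP conference on October 23. He did not give me a fully satisfying answer either (which I had expected), because a database will always have a bottleneck, unless it is some kind of miracle database, haha!

As an aside, at the LAMP conference I chatted with a senior technical executive from Yahoo and asked how Yahoo decides between MySQL and Oracle. His answer surprised me greatly. He said that most of the time they use MySQL, because its performance already meets their requirements. So when do they choose Oracle? When they need to store the data of paying users. I asked why; is Oracle more stable than MySQL? He said that was not a particular consideration. The key is that with Oracle, when something goes wrong they can find someone responsible, and Oracle will take charge of handling the incident. With MySQL, whom would they turn to?

So we should correct our view of the database: the database is not a cure-all. If you have the resources, develop your own database. I hear that is what Google does.

How, then, should we regard the database? My personal understanding is that a database is merely a means of reducing development cost. With a database we do not need to think about storing the data, and especially not about sorting and searching. But what problem does this bring? When the business grows, the database becomes the bottleneck! At that point the problem is very thorny, because this is low-level data processing, and touching one part affects everything.

So I believe the correct view is that the database is a data backup machine! What does that mean? We only need it to guarantee that data is stored reliably. That was the core function of the database in the first place; it is just that conveniences like sorting have led everyone to hand too much processing over to it. On a single user click, PHP hands a pile of tasks to the database, lines up the results for the user, and is done. That is unfair to the database! And it is why people have started to complain about database performance.

To illustrate this view, here is another example. I once visited a large Internet company (practically everyone in China who has been online knows it). They use very little PHP, but I learned how their other businesses use the database. They proudly told me that they have a second database on the periphery of the database ("second database" is my own name for it). Why call it a second database? It turns out to be a caching system. How do development engineers get data from this caching system? The technical director proudly said that the caching system has its own SQL query statements! I was surprised at first, but on reflection it really is needed. Once a caching system reaches a certain scale, getting data out of the cache becomes very complicated, so you might as well write SQL queries and let the caching system parse them, process them, and return the data. They also told me that even when they do use PHP, they have PHP read data from that caching system.

So if you can handle this kind of problem well, you store the data in the database and let the database serve only as a backup. You then use your own middle layer to process and analyze the data, with the result that more than 90% of user requests never touch the database. Some will say: isn't this something like a connection pool? Yes! Because the database bottleneck cannot be eliminated, all we can do is add a middle layer between the web server and the database as a buffer.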
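
A minimal sketch of such a middle layer, using the common cache-aside pattern: look in the cache first and go to the database only on a miss or after the entry has expired. Everything here is an assumption made for illustration (the article describes no implementation): `slow_db_query` stands in for a real database call, and the file-based cache, the 60-second expiry, and the function names are invented.

```shell
#!/bin/bash
# Sketch: cache-aside layer between the application and the database.
CACHE_DIR=$(mktemp -d)   # one file per cached key
CACHE_TTL=60             # seconds before a cached entry counts as stale
DB_HITS=0                # how many times the "database" was actually queried

# Hypothetical stand-in for a real database query; sets RESULT.
slow_db_query() {
    DB_HITS=$((DB_HITS + 1))
    RESULT="row-for-$1"
}

# Look up a key through the cache; sets RESULT.
cached_query() {
    local key="$1" file now stored
    # File name derived from a checksum of the key (collisions are ignored here).
    file="$CACHE_DIR/$(printf '%s' "$key" | cksum | cut -d' ' -f1)"
    now=$(date +%s)
    if [ -f "$file" ]; then
        stored=$(head -n 1 "$file")          # first line: time of caching
        if [ $((now - stored)) -lt "$CACHE_TTL" ]; then
            RESULT=$(tail -n +2 "$file")     # remaining lines: cached value
            return 0
        fi
    fi
    # Miss or expired entry: ask the database and refresh the cache.
    slow_db_query "$key"
    printf '%s\n%s\n' "$now" "$RESULT" > "$file"
}
```

Calling `cached_query "user:1"` twice in a row queries the stand-in database only once; the second call is answered from the cache file. Expiry here is a simple timestamp check, which sidesteps the harder invalidation problems the article mentions.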

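You may say: come on, we knew that long ago! Fine. What I want to talk about here are two thoughts it provokes:

First, some languages already have connection pool technology, and their programmers can conveniently use connection pools to build large applications. If they think PHPers only know how to use a database, can't we say that they only know how to use a connection pool? In this respect, what is the difference between a connection pool and a database?

Second, when a PHPer starts building his own caching system, has he not risen above the level of "PHPers only know how to use a database"? After all, he is taking part in processing the data logic. Is he still grassroots then?

Finally, is the new generation of PHPers grassroots?

Original link: http://www.phpchina.com/?1/action_viewspace_itemid_2520.html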