Blog

On FB's extreme engineering culture

* very engineering driven culture. “product managers are essentially useless here.” is a quote from an engineer. engineers can modify specs mid-process, re-order work projects, and inject new feature ideas anytime.

Engineers love reading about this culture; product managers want to curse at it. Ha.

I think FB has this culture because, for FB, the risk of suppressing innovation is greater than the risk of projects failing.

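This is a 2,000-person team. Personally, I think that once the top leader can no longer name everyone on the team and say which project each person is on, bureaucracy is bound to set in (my ideal company size is between 50 and 150 people). Without a culture like this, innovation would be crushed under endless process and rules. On the other hand, FB is already big enough and far enough ahead that it might run 20 projects at once: even if 80% of them slip, 4 new projects still launch on time, and even if 100% fail, FB's overall position is not affected in the short term.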

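Ask whether your own team currently faces the greater risk from failed projects or from suppressed innovation. This is not something everyone can copy. And don't forget that FB also has a harsh performance culture: engineers whose projects keep slipping have to leave.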

The backdrop for FB's culture is Silicon Valley culture. And what is our backdrop? The "soy-sauce vat" culture (酱缸文化).

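Finally, Mark is a college dropout, I believe, so FB's temperament is clearly different from that of Google, which was founded by two PhD students... FB leans toward engineering, Google toward research.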

Topic: Business, Technology

Facebook's engineering culture

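Someone published "How Facebook Ships Code". I found the part about Facebook's engineer-driven culture especially interesting, so I translated it (I just googled it, and other translations are already online; people are quick)..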

* as of June 2010, the company has nearly 2000 employees, up from roughly 1100 employees 10 months ago. Nearly doubling staff in under a year!

As of June 2010, FB had about 2,000 employees, up by nearly 1,000 from 10 months earlier.

* the two largest teams are Engineering and Ops, with roughly 400-500 team members each. Between the two they make up about 50% of the company.

The two largest teams are Engineering and Ops, each with roughly 400-500 people.

* product manager to engineer ratio is roughly 1-to-7 or 1-to-10

The ratio of product managers to engineers is roughly between 1:7 and 1:10.

* all engineers go through 4 to 6 week “Boot Camp” training where they learn the Facebook system by fixing bugs and listening to lectures given by more senior/tenured engineers. estimate 10% of each boot camp’s trainee class don’t make it and are counseled out of the organization.

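Newly hired engineers go through a 4-6 week Boot Camp to get familiar with FB, fixing bugs and attending lectures given by senior engineers. About 10% of each class does not make it through and is counseled out.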

* after boot camp, all engineers get access to live DB (comes with standard lecture about “with great power comes great responsibility” and a clear list of “fire-able offenses”, e.g., sharing private user data)

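After Boot Camp, all engineers get access to the production database. This comes with the standard "with great power comes great responsibility" lecture and a clear list of fireable offenses, such as disclosing users' private data.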

* [EDIT thx fryfrog] “There are also very good safe guards in place to prevent anyone at the company from doing the horrible sorts of things you can imagine people have the power to do being on the inside. If you have to “become” someone who is asking for support, this is logged along with a reason and closely reviewed. Straying here is not tolerated, period.”

* any engineer can modify any part of FB’s code base and check-in at-will

Any engineer can modify any part of FB's code base and check in at will.

* very engineering driven culture. “product managers are essentially useless here.” is a quote from an engineer. engineers can modify specs mid-process, re-order work projects, and inject new feature ideas anytime.

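An extreme engineering culture. One engineer put it this way: "product managers can be completely ignored, scorned, and disregarded." Engineers can still change the spec halfway through, have the authority to reprioritize projects, and can inject their own new ideas at any time.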

* during monthly cross-team meetings, the engineers are the ones who present progress reports. product marketing and product management attend these meetings, but if they are particularly outspoken, there is actually feedback to the leadership that “product spoke too much at the last meeting.” they really want engineers to publicly own products and be the main point of contact for the things they built.

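In the monthly cross-team meetings, engineers are the ones who present progress reports. Product marketing and product management attend too, but if they are particularly outspoken, leadership gets feedback that "product spoke too much at the last meeting." The expectation is that engineers own the products and are the main point of contact for what they build.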

* resourcing for projects is purely voluntary.
o a PM lobbies group of engineers, tries to get them excited about their ideas.
o Engineers decide which ones sound interesting to work on.
o Engineer talks to their manager, says “I’d like to work on these 5 things this week.”
o Engineering Manager mostly leaves engineers’ preferences alone, may sometimes ask that certain tasks get done first.
o Engineers handle entire feature themselves — front end javascript, backend database code, and everything in between. If they want help from a Designer (there are a limited staff of dedicated designers available), they need to get a Designer interested enough in their project to take it on. Same for Architect help. But in general, expectation is that engineers will handle everything they need themselves.
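Staffing for projects comes entirely from engineers volunteering:
  • PMs lobby groups of engineers and try to get them excited about working on their ideas

  • Engineers decide for themselves which PM's work to take on

  • Engineers then tell their manager: "I'd like to work on these 5 things this week"

  • The engineering manager mostly lets engineers follow their own preferences, and occasionally asks that certain tasks be done first

  • Engineers handle everything themselves, all the logic from JavaScript to the database. If they want help from a designer (FB has only a few dedicated designers), they have to get a designer interested enough to join their project; the same goes for architect help. In general, though, engineers do all the work themselves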

* arguments about whether or not a feature idea is worth doing or not generally get resolved by just spending a week implementing it and then testing it on a sample of users, e.g., 1% of Nevada users.

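Whether a feature is worth building is generally not argued over at length. Spend a week implementing it, release it to a small sample of users (for example, 1% of users in Nevada), and let the test decide.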
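The article does not say how Facebook picks the "1% of Nevada users", so the following is only a minimal sketch of one common approach: deterministic hash-based bucketing. The function name `in_experiment`, the region code `"NV"`, and the experiment name are all hypothetical and not taken from the original post.

```python
import hashlib


def in_experiment(user_id: str, region: str, experiment: str,
                  target_region: str = "NV", percent: float = 1.0) -> bool:
    """Return True if this user falls into the experiment sample.

    Assumption: a user is eligible only in `target_region`, and roughly
    `percent` percent of eligible users are selected. Hashing the
    experiment name together with the user id keeps the choice stable
    for a given user and independent across experiments.
    """
    if region != target_region:
        return False
    digest = hashlib.sha256(f"{experiment}:{user_id}".encode("utf-8")).hexdigest()
    bucket = int(digest[:8], 16) % 10000   # bucket in 0..9999
    return bucket < percent * 100          # 1.0% -> buckets 0..99


if __name__ == "__main__":
    sample = sum(in_experiment(f"user{i}", "NV", "new_feed") for i in range(100000))
    print(f"{sample} of 100000 Nevada users see the new feature")
```

Because the decision depends only on the user id and the experiment name, the same user sees the same variant on every request, with no need to store the assignment.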

* engineers generally want to work on infrastructure, scalability and “hard problems” — that’s where all the prestige is. can be hard to get engineers excited about working on front-end projects and user interfaces. this is the opposite of what you find in some consumer businesses where everyone wants to work on stuff that customers touch so you can point to a particular user experience and say “I built that.” At facebook, the back-end stuff like news feed algorithms, ad-targeting algorithms, memcache optimizations, etc. are the juicy projects that engineers want.

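Engineers generally prefer infrastructure, high load and high concurrency, the so-called "hard problems": the things that earn prestige. It is hard to get an engineer fired up about tinkering with user interfaces. Some consumer businesses are the opposite: everyone wants to work on things that touch the user experience so they can point at some part of a web page and say, "I built that." At FB, back-end work such as the News Feed algorithm, ad-targeting algorithms, and memcached optimization is what engineers most want to do (qyb: this paragraph is hard to translate; can anyone tell me what a "juicy project" is??)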

* commits that affect certain high-priority features (e.g., news feed) get code reviewed before merge. News Feed is important enough that Zuckerberg reviews any changes to it, but that’s an exceptional case.
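Code commits that affect highly sensitive features are always code reviewed before merge. News Feed is important enough that Zuckerberg personally reviews every change to it, but that is an exceptional case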
* [CORRECTION -- thx epriest] “There is mandatory code review for all changes (i.e., by one or more engineers). I think the article is just saying that Zuck doesn’t look at every change personally.”

* [CORRECTION thx fryfrog] “All changes are reviewed by at least one person, and the system is easy for anyone else to look at and review your code even if you don’t invite them to. It would take intentionally malicious behavior to get un-reviewed code in.”

* no QA at all, zero. engineers responsible for testing, bug fixes, and post-launch maintenance of their own work. there are some unit-testing and integration-testing frameworks available, but only sporadically used.
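FB has no QA, literally zero. Engineers are responsible for testing, bug fixes, and post-launch maintenance. There are unit-testing and integration-testing frameworks, but they are rarely used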
* [CORRECTION thx fryfrog] “I would also add that we do have QA, just not an official QA group. Every employee at an office or connected via VPN is using a version of the site that includes all the changes that are next in line to go out. This version is updated frequently and is usually 1-12 hours ahead of what the world sees. All employees are strongly encouraged to report any bugs they see and these are very quickly actioned upon.”

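"I would add that FB does have QA, just not an official QA group. Every employee in the office or connected via VPN uses a version of the site that includes the changes next in line to ship. That version is updated frequently and is usually 1-12 hours ahead of what the public sees. All employees are strongly encouraged to report any bugs they run into, and these are acted on very quickly."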

* re: surprise at lack of QA or automated unit tests — “most engineers are capable of writing bug-free code. it’s just that they don’t have an incentive to do so at most companies. when there’s a QA department, it’s easy to just throw it over to them to find the errors.” [EDIT: please note that this was subjective opinion, I chose to include it in this post because of the stark contrast that this draws with standard development practice at other companies]
* [CORRECTION thx epriest] “We have automated testing, including “push-blocking” tests which must pass before the release goes out. We absolutely do not believe “most engineers are capable of writing bug-free code”, much less that this is a reasonable notion to base a business upon.”

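"FB has automated testing, including 'push-blocking' tests that must pass before a release can go out. We absolutely do not believe that 'most engineers are capable of writing bug-free code', much less that this is a reasonable notion to base a business on."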

* re: surprise at lack of PM influence/control: product managers have a lot of independence and freedom. The key to being influential is to have really good relationships with engineering managers. Need to be technical enough not to suggest stupid ideas. Aside from that, there’s no need to ask for any permission or pass any reviews when establishing roadmaps/backlogs. “My product director doesn’t even really know all the things I have on my roadmap.” There are relatively few PMs, but they all feel like they have responsibility for a really important and personally-interesting area of the company.

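re: surprise at the lack of PM influence/control. Product managers actually have a great deal of independence and freedom. The key to a PM's influence is a good relationship with engineering managers. PMs need to be technical enough not to suggest stupid ideas. Beyond that, they do not need anyone's permission or review when setting their roadmaps. "My product director doesn't even really know everything I have on my roadmap." There are few PMs, but they all feel responsible for an area of the company that is really important and personally interesting.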

* by default all code commits get packaged into weekly releases (tuesdays)

By default, all code commits are packaged into a weekly release (Tuesdays)

* with extra effort, changes can go out same day

With extra effort, a change can go out the same day

* tuesday code releases require all engineers who committed code in that week’s release candidate to be on-site

The Tuesday release requires every engineer who committed code to that week's release candidate to be on site

* engineers must be present in a specific IRC channel for “roll call” before the release begins or else suffer a public “shaming”

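Before the release begins, engineers must be present in a specific internal IRC channel for roll call, or else be publicly shamed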

* ops team runs code releases by gradually rolling code out
o facebook has around 60,000 servers
o there are 9 concentric levels for rolling out new code
o [CORRECTION thx epriest] “The nine push phases are not concentric. There are three concentric phases (p1 = internal release, p2 = small external release, p3 = full external release). The other six phases are auxiliary tiers like our internal tools, video upload hosts, etc.”
o the smallest level is only 6 servers
o e.g., new tuesday release is rolled out to 6 servers (level 1), ops team then observes those 6 servers and make sure that they are behaving correctly before rolling forward to the next level.
o if a release is causing any issues (e.g., throwing errors, etc.) then push is halted. the engineer who committed the offending changeset is paged to fix the problem. and then the release starts over again at level 1.
o so a release may go thru levels repeatedly: 1-2-3-fix. back to 1. 1-2-3-4-5-fix. back to 1. 1-2-3-4-5-6-7-8-9.
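The ops team runs code releases, rolling code out gradually to everyone
  • FB has around 60,000 servers

  • A release has 3 concentric phases: p1 = internal release, p2 = small external release, p3 = full external release. Auxiliary systems such as video upload hosts make up the other 6 push phases, giving p1 through p9 in total

  • The smallest level covers only 6 servers (qyb: I guess this means FB can run all of its services on just 6 servers)

  • The Tuesday release starts at p1: the ops team watches those 6 servers to make sure they behave correctly, then rolls forward to the next level

  • If a release causes errors, the whole push is halted. The engineer who committed the offending code is paged to fix it, and then the release starts over from p1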
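The "halt, fix, start over from level 1" loop described above can be sketched as a small simulation. This is an illustration under stated assumptions, not Facebook's tooling: `staged_push` and `is_healthy` are hypothetical names, and the health check stands in for whatever the ops team actually observes on the servers.

```python
from typing import Callable, List


def staged_push(num_levels: int,
                is_healthy: Callable[[int, int], bool],
                max_attempts: int = 10) -> List[List[int]]:
    """Simulate a staged rollout that restarts from level 1 after a failure.

    `is_healthy(attempt, level)` is a hypothetical check that returns False
    when the code just rolled out to `level` misbehaves on that attempt.
    Returns, for each attempt, the list of levels that were rolled out.
    """
    attempts: List[List[int]] = []
    for attempt in range(max_attempts):
        deployed: List[int] = []
        for level in range(1, num_levels + 1):
            deployed.append(level)              # roll code out to this level
            if not is_healthy(attempt, level):
                break                           # halt the push; a fix is made
        else:
            attempts.append(deployed)           # every level passed: done
            return attempts
        attempts.append(deployed)               # next attempt restarts at 1
    raise RuntimeError("push abandoned after too many failed attempts")


if __name__ == "__main__":
    # Reproduce the article's example: 1-2-3-fix, 1-2-3-4-5-fix, then 1..9.
    failures = {(0, 3), (1, 5)}
    history = staged_push(9, lambda attempt, level: (attempt, level) not in failures)
    for levels in history:
        print("-".join(str(n) for n in levels))
```

Restarting from level 1 after every fix means the fix itself goes through the same small-blast-radius checks as the original change.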

* ops team is really well-trained, well-respected, and very business-aware. their server metrics go beyond the usual error logs, load & memory utilization stats — also include user behavior. E.g., if a new release changes the percentage of users who engage with Facebook features, the ops team will see that in their metrics and may stop a release for that reason so they can investigate.

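The ops team is really... impressively good. Their dashboards go beyond error logs, system load, and memory usage: they also track user behavior. If a release changes the percentage of users who engage with some FB feature, it shows up on the dashboard, and they may halt the release to investigate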
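A guard of this kind can be sketched as a comparison between an engagement rate before and after the push. The function name `should_halt` and the 5% threshold are assumptions made for illustration; the article gives no details of the real metrics or thresholds.

```python
def should_halt(baseline_rate: float, current_rate: float,
                max_relative_change: float = 0.05) -> bool:
    """Return True if a user-behavior metric moved too far after a release.

    `baseline_rate` is the share of users engaging with a feature before
    the push and `current_rate` the share afterwards. The 5% default
    threshold is an arbitrary illustrative value.
    """
    if baseline_rate <= 0:
        raise ValueError("baseline_rate must be positive")
    relative_change = abs(current_rate - baseline_rate) / baseline_rate
    return relative_change > max_relative_change


if __name__ == "__main__":
    print(should_halt(0.40, 0.41))  # small shift: keep rolling forward
    print(should_halt(0.40, 0.30))  # large drop: halt and investigate
```

Note that the check is symmetric: an unexpected jump in engagement halts the push just like a drop, since either may indicate a bug.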

* during the release process, ops team uses an IRC-based paging system that can ping individual engineers via Facebook, email, IRC, IM, and SMS if needed to get their attention. not responding to ops team results in public shaming.

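During the release, the ops team can page engineers at any time through an IRC-based system. Developers who do not respond to the ops team promptly are publicly shamed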

* once code has rolled out to level 9 and is stable, then done with weekly push.

Once the release has reached p9 and is stable, the weekly push is done

* if a feature doesn’t get coded in time for a particular weekly push, it’s not that big a deal (unless there are hard external dependencies) — features will just generally get shipped whenever they’re completed.

* getting svn-blamed, publicly shamed, or slipping projects too often will result in an engineer getting fired. “it’s a very high performance culture”. people that aren’t productive or aren’t super talented really stick out. Managers will literally take poor performers aside within 6 months of hiring and say “this just isn’t working out, you’re not a good culture fit”. this actually applies at every level of the company, even C-level and VP-level hires have been quickly dismissed if they aren’t super productive.

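Getting svn-blamed (qyb: my guess is that svn-blamed means someone committed an especially stupid bug, and the svn blame command was used to pull out the commit's author, who then gets posted to an internal mailing list... perhaps FB periodically publishes a list of these engineers, known as the svn-blamed), being publicly shamed, or slipping projects too often: any of these can get an engineer fired. "It's a very high performance culture": people who are not excellent or not productive get weeded out. Within six months of being hired, poor performers are told by their manager that "this just isn't the right fit for you." Even C-level and VP-level hires have been quickly dismissed for not meeting the higher expectations. (qyb: so it looks like there are only 4 levels below Mark: A/B/C/VP)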

* [CORRECTION, thx epriest] “People do not get called out for introducing bugs. They only get called out if they ask for changes to go out with the release but aren’t around to support them in case something goes wrong (and haven’t found someone to cover for you).”
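"Engineers are not called out just for introducing bugs. They are called out only if they asked for changes to go out with the release but are not around to support them when something goes wrong, and have not found someone to cover for them."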
* [CORRECTION, thx epriest] “Getting blamed will NOT get you fired. We are extremely forgiving in this respect, and most of the senior engineers have pushed at least one horrible thing, myself included. As far as I know, no one has ever been fired for making mistakes of this nature.”
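"Getting svn-blamed will not get you fired. We are very forgiving here. Most senior engineers, myself included, have pushed at least one horrible thing. As far as I know, no one has ever been fired for mistakes of this kind."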

* [CORRECTION, thx fryfrog] “I also don’t know of anyone who has been fired for making mistakes like are mentioned in the article. I know of people who have inadvertently taken down the site. They work hard to fix what ever caused the problem and everyone learns from it. The public shaming is far more effective than fear of being fired, in my opinion.”

Topic: Technology

State of work

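Lao Fan ruined my good mood before bed: he called to complain at length about how the tech side isn't delivering. The direct result was that I woke up a little after 3 a.m. and couldn't get back to sleep, so I opened Outlook and Gmail, reread a few résumés, and thought about what to do with them.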

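I once told MarkWu, half mocking myself, that these days I'm just an administrator and deeply looked down on... Thinking about it now, it really is so. To use an army as the analogy: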

I have 2-3 campaigns to fight at the same time, plus stalemated fighting in several local areas

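I am badly short of troops and have to recruit and train several dozen people. Different service branches, front-line commanders with different temperaments... What kind of person I manage to recruit on any given day is almost random, and I have to work out one by one which battlefield to drop each of them into so as to gain the most, without having them wiped out the moment they parachute in. Every day I have to war-game every detail again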

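Lao Fan has long wanted to merge several local fights into one big campaign. The idea is sound, but planning and supporting that campaign takes a lot of preparation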

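In the other areas, something new can come up at any moment, and nobody knows whether a small defeat will set off a chain reaction. I just airdropped a comrade into Fuzhou..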

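After a long spell in a desk job at the General Staff, General Political, General Logistics, or General Armaments Department, it's hard to go back to leading troops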

Topic: Technology

Media, media

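In my last six months of work, it's not just that my workload has grown; the key thing is that I've slowly begun to understand the trait I'd call "the character of a media organization."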

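I used to pay attention only to Google, Yahoo, Facebook... After more contact with the content department, I realized that Sohu may at heart be closer to companies like nytimes and wsj. Their influence on American society does not seem to weaken as publishing and distribution channels change.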

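A few months ago, on a hike through Shengshuiyu, I walked with Zhang Fan, who was skeptical of Sina Weibo. At the time I probably didn't take him seriously; now I fully understand his point: microblogs, Twitter included, are great distribution channels, but you cannot build a media brand on them.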

Reading time vs fragmented time

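The media value that Yu Wei, Zhang Fan, Fang Jun, Shi Yan and other colleagues hold on to must, on the user's side, be built up during reading time. Do we still need to read? Or, to put it more sharply: will Sohu still need to offer users media value in the future?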

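Another critical issue is how much the internet tech-and-product crowd recognizes media value. One dream is about distribution, channels, and user relationships; the other is about content being king and journalistic ideals.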

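I don't know how the two will be reconciled, so I can only offer an optimistic estimate based on the film industry: it went from silent to sound, and then to DTS; from black-and-white to color, and then to 3D; from film stock to digital; from family-run screening rooms to theater chains dominated by giants, and then to iTunes. Film has been pronounced dead countless times, and there was a time when we hardly went to the cinema and hung around video parlors instead. A hundred years on, from The Arrival of a Train to Avatar, the technology has changed beyond recognition, but film is still film.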

Media is media. Going to Weibo to pull in users is operations, not editing

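Perhaps, just perhaps, the internet portal is only a temporary marriage of media and capital, but I feel very honored to be working alongside this group of media people right now.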

The People's Daily and CCAV are bound to collapse one day; I hope the portals can hold out until then.

Topic: Business

Training a batch of new colleagues today

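Our department has recently taken on quite a few soon-to-graduate college seniors, who may end up as editors or in product and operations. So that these rookies (mostly liberal arts students) have a decent basis for talking with engineers later on, I drew up 10 topics for training, each expected to grow into a 2-hour class: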

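  1. Computer fundamentals
  2. Networking fundamentals and the HTTP protocol
  3. Network operations
  4. Project management
  5. Software testing and using SVN/JIRA
  6. Internet storage
  7. The evolution of web front-end technology
  8. Open platforms and APIs
  9. Artificial intelligence and personalized recommendation
  10. Mobile internet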

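I am preparing the first two topics myself, and taught the first class today. It revolved around two core concepts, caching and parallelism, and ended with an introduction to how throughput and latency relate to concurrency.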
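The post does not say how the relationship was presented in class. One standard formulation is Little's Law: in a steady state, average concurrency equals throughput multiplied by average latency. The sketch below uses hypothetical function names and made-up numbers purely to illustrate that relationship.

```python
def concurrency(throughput_per_s: float, latency_s: float) -> float:
    """Little's Law: average requests in flight = throughput x latency."""
    return throughput_per_s * latency_s


def max_throughput(workers: int, latency_s: float) -> float:
    """Upper bound on throughput when at most `workers` requests run at once."""
    if latency_s <= 0:
        raise ValueError("latency_s must be positive")
    return workers / latency_s


if __name__ == "__main__":
    # 200 requests/s at 50 ms each keeps about 10 requests in flight.
    print(concurrency(200, 0.05))
    # 10 workers at 50 ms per request can serve at most about 200 requests/s.
    print(max_throughput(10, 0.05))
```

The same relationship explains why lowering latency (for example with a cache) or raising concurrency (with parallelism) are the two levers for increasing throughput.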

Besides the class, we also talked about how to get along with programmers:

Me: What is your impression of programmers?

A: Homebodies... No life...

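Me: First, a programmer's world is a world of 0s and 1s, full of rules, where 1+1 always equals 2; programmers are the masters of their own world and create everything in it. Second, programmers are highly independent: they can create value without much close collaboration, and today a good programmer can easily land a decent job and support themselves without having to curry favor with anyone. Keep these two points in mind when you deal with them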

Q: Where do programmers most often clash with product people?

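Me: Schedules. Programmers are usually reluctant to estimate timelines, mainly because a project has too many unknowns for an accurate estimate; and once the project starts, requirements may change, adding more uncertainty. The deadline basically ends up being a political target rather than a product or technical one, and in that situation, bugs that stem from disagreement are all the more likely to trigger conflict.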

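Me: In the early days, programmers decided the user interface, designing it according to their own way of thinking. You have to convince programmers that you have a thorough method and model for studying and analyzing what users really need and want. They need to be convinced that you are representing users' interests when you say this is a problem or that is a bug. Otherwise there will certainly be conflict; I'd guess it takes at least half a year to work things out together.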

Q: Do all programs have bugs?

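Me: Hoho, I see you haven't heard the story of Grandpa Knuth and TeX... (1,000 words omitted here)... Anyway, a good programmer is certainly 100 times more productive than an average one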

Q: How do you spot a good programmer?

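Me: Have two people each build a single-story house. Freshly built, the two look about the same. Then have them add three floors on top and dig a basement and a garage underneath, and the difference shows immediately.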

Q: Can women be programmers?

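Me: Of course. The main issue is that too few women work in this field; at Sohu the ratio is roughly 1:5 to 1:7, so there are also fewer outstanding women programmers. In my experience, for every 5-7 reliable male programmers I meet, I also meet 1 reliable woman programmer. Becoming an excellent programmer means overcoming a great many difficulties, whether you are a man or a woman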

Topic: Technology

2010/2011 miscellany

I have a fever, an illness straddling the new year. It's not serious, around 37.5°C, and that is how the new year begins

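I bought Dada a CARAVA shell jacket on Taobao. It's well made, and the details on the sleeves show the maker's care. Outdoor gear that leaks into the gray market basically never comes in children's sizes, so for kids' stuff I think this brand is decent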

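Qiu Kexin recently learned her first real swear word. The fansub group for Despicable Me went all out: the subtitles are full of internet slang like "哥" (bro), "爷" (a swaggering "I"), and "神马" ("what"), and they even rendered "silly bean" as "傻B" (a vulgar word for "idiot"). Phonetically the translation is actually quite good, but I've started to worry that she'll swear at school and I'll be called in. My wife reassured me that the school only calls parents in over poor grades; for this kind of thing she'd just have to write a self-criticism. Ha!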

Watching Qiu Kexin make words with "灰" (ash, gray): first she came up with "灰尘" (dust), then "骨灰" (cremated ashes)..... :(

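I found that Qiu Kexin used how well her dad plays music in one of her sentence-making exercises, and I was quite pleased with myself.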

Making a sentence with "although... but": "Although I study very hard, I just can't score 100." The melancholy of a second grader

===============

I looked back at last year's "2009/2010" post; the only wish that came true was the "Champions Cup"

In 2010 my work was split into two parts, and in the second half of the year I began to understand Sohu better.

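I hope the new year brings big changes for the team and for Sohu. New Year's wishes:
1. My curse on [sensitive word] takes effect
2. My parents stay healthy and safe
3. Climb a mountain above 5,000 m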

Just these three; one shouldn't hope for too much

Topic: dada, Life

Setting out again

It's the last day of the year again.

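This year was busy and fulfilling. From meeting both sets of parents and planning the wedding in the first half to getting married and buying a car in the second half, it was hectic, and just like that the year went by. Almost every weekend and every holiday was spent on a plan, with a purpose. Fortunately it wasn't too hard or too tiring, and for all of this I have to thank my beautiful, virtuous wife and our parents, brothers, and sisters-in-law, who spared no effort in supporting us.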

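Next year certainly won't be easy. I'm getting older and the responsibilities are getting heavier. I don't have much more to say: keep up this year's momentum and set out again.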

Topic: Life

ノルウェイの森  (2010)

http://www.norway-mori.com/, opens on 12/11. I'll probably have to wait and download it from eMule; even if it reaches China it will be cut...

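Years ago, when I discovered that the Sohu Blog slogan was "those who have met will meet again" (相逢的人会再相逢), I was quite startled, because I could never sense such a strong literary-youth streak in the leaders of the blog team... A month or so ago, young idk even brought the slogan up with me, thinking it was my idea. Now that the blog has more or less something to do with me, I have to support it properly, if not for Lao Fan's sake then for Murakami's. Hehe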

Topic: Culture, Music

Last Christmas

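I happened to discover that the Christmas song the radio has been playing constantly for the last couple of days was originally sung by Wham!. On a whim I searched for it on Youku; George Michael was so young back then, and now he's well into middle age...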

Last christmas, I gave you my heart
But the very next day you gave it away
This year, To save me from tears
I'll give it to someone special

Topic: Life, Music