Archive for December, 2007

how to optimize memory usage in your application

building a javascript library (jQuery)

http://video.google.com/videoplay?docid=-474821803269194441

jQuery is a very popular library. Maybe someday I should look through it.

How to write a good library?

  • Write a solid API (make a grid, fill in the blanks)
  • Fear adding methods (methods can cause a support nightmare; avoid adding them if you can; defer to extensibility)
  • Embrace Removing code
  • Provide an Upgrade path
  • Reduce to a common root
  • Consistency
    • Pick a naming scheme and stick with it
    • Argument position (options, arg2, … , callback)
    • Callback context
  • Namespacing (questions to ask)
    • Can my code coexist with other random code on the site?
    • Can my code coexist with other copies of my own library?
    • Can my code be embedded inside another namespace?
  • Perform Type Checking
    • Make your API more fault resistant
    • Correct values whenever possible
    • Error messages
  • Errors
    • Never gobble errors
    • Ignore the temptation to try { … } catch(e) {}
    • Improves debuggability for everyone [weil: Mike Burrows gives the same suggestion; he said we should raise an assertion violation as early as we can.]
  • Extensibility
    • Your code should be easily extensible
    • Write less, defer to others
    • Makes for cleaner code
    • Foster community and growth
  • Documentation
    • Structured (provide a clear format; users can build new views with it. An API for your API!)
    • Users want to help
      • Make the barriers to helping very low
      • Keep your docs in a wiki
      • Only do this if you’ve already written all of your docs
      • Use templates to maintain structure
    • Write the Docs Yourself
      • It isn’t glamorous, but it’s essential
      • You must buckle-down and do it yourself
      • Improves your longevity
  • Testing (1000% essential)
    • Test-driven development
      • Write test cases before you tackle the bugs
      • Find devs who love to write test cases
      • Check for failures before commit
  • Maintain Focus
    • very very important
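
A minimal sketch of three of the points above: consistent argument position (options first, callback last), type checking that corrects values where it can, and never gobbling errors. The talk is about JavaScript; this is Python, and the function `repeat_text` and its option names are hypothetical, made up only for illustration.

```python
from typing import Callable, Optional


def repeat_text(options: dict, callback: Optional[Callable[[str], None]] = None) -> str:
    """Hypothetical library function: options come first, the callback last."""
    # Type checking: fail early with a clear error message.
    if not isinstance(options, dict):
        raise TypeError(f"options must be a dict, got {type(options).__name__}")
    text = options.get("text")
    if not isinstance(text, str):
        raise TypeError("options['text'] must be a string")

    # Correct values whenever possible: accept "3" as well as 3.
    raw_times = options.get("times", 1)
    try:
        times = int(raw_times)
    except (TypeError, ValueError):
        raise ValueError(f"options['times'] must be an integer, got {raw_times!r}") from None
    if times < 0:
        raise ValueError("options['times'] must not be negative")

    result = text * times
    if callback is not None:
        # Deliberately no try/except: an error raised inside the callback
        # propagates to the caller instead of being swallowed.
        callback(result)
    return result
```

Calling `repeat_text({"text": "ab", "times": "2"})` quietly corrects the string `"2"` to an integer, while `repeat_text("ab")` fails immediately with a `TypeError` instead of misbehaving later.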

Gears and the mashup problem (speaker: Yahoo)

 http://video.google.com/videoplay?docid=452089494323007214

The Yoda of lambda programming, on Google Gears.

any damn fool could produce a better data format than XML (James Clark, 2007-04-06)

 Java

  • Java was a huge failure
  • Very popular, high acceptance
  • “Write once, run everywhere” promise not kept
  • Unworkable “blame the victim” security model.
  • Tedious UI model.
  • Successful as a server technology.

Ajax

  • Applications without installation
  • Highly interactive
  • High social potential
  • Easy to use
  • Great network efficiency
  • but it is too damn hard to write applications

Mashups: the most interesting innovation in software development in 20 years. But mashups are insecure: a mashup must not have access to any confidential information.

Why?

JavaScript dumps all programs into a common global space. There is nothing to protect the secrets of one component from another; any information in any component is visible to all other components.
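
A rough analogy in Python, not how a browser actually works: a single shared dictionary stands in for the common global space, and the two "widgets" are hypothetical components on the same page. Whatever one component stores, the other can read.

```python
# One shared dictionary stands in for the page's global space (an analogy only).
global_space = {}


def banking_widget(space):
    # A trusted component keeps a secret where it can find it again.
    space["session_token"] = "s3cret"


def third_party_widget(space):
    # An unrelated component can enumerate and copy everything in the same space.
    return dict(space)


banking_widget(global_space)
leaked = third_party_widget(global_space)
```

After this runs, `leaked` contains the session token, which is exactly the problem for mashups that combine code from different sources.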

 Drivers of innovation

  1. Proposal
  2. Standard
  3. Browser Makers
  4. Application Developers

What kind of work is worth doing? We should do work that has an impact on others.

Sometimes I don’t like computer systems research, because so little of it is really useful to others. People keep playing trick games, and I hate that.

Data Center Networking

Data center applications are becoming more and more important to companies such as Microsoft, Google, and Amazon. In this scenario, do we still need the TCP/IP network stack? TCP/IP targets complex environments, so it has to deal with routing and failure handling. Those are unnecessary inside a data center, so maybe it is time to redesign the network stack for the data center.

Here are some of my initial ideas about this topic.

We partition the machines in the data center into several groups. Within each group, the machines are all connected to each other. We don’t need to maintain connections or a resend mechanism.

Dynamo: Amazon’s Highly Available Key-Value Store

System Assumptions and Requirements

  • Query Model: simple read and write operations to a data item that is uniquely identified by a key.
  • Dynamo targets applications that can operate with weaker consistency if this results in high availability. Dynamo does not provide any isolation guarantees and permits only single-key updates.
  • Efficiency: The system needs to function on commodity hardware infrastructure. Services have stringent latency requirements, which are in general measured at the 99.9th percentile of the distribution (for example, a response within 300 ms for 99.9% of requests at a peak client load of 500 requests per second).
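
To make the 99.9th-percentile requirement concrete, here is a small sketch that checks a list of measured latencies against such a target. The nearest-rank percentile definition and the function names are my own assumptions; the paper does not say how the percentile is computed.

```python
import math
from fractions import Fraction


def percentile_nearest_rank(samples, pct):
    """Smallest sample such that at least pct percent of samples are <= it."""
    if not samples:
        raise ValueError("samples must not be empty")
    if not 0 < pct <= 100:
        raise ValueError("pct must be in (0, 100]")
    ordered = sorted(samples)
    # Exact rational arithmetic avoids float rounding for values like 99.9.
    rank = math.ceil(Fraction(str(pct)) * len(ordered) / 100)
    return ordered[rank - 1]


def meets_sla(latencies_ms, pct=99.9, limit_ms=300):
    """True when the pct-th percentile latency is within limit_ms."""
    return percentile_nearest_rank(latencies_ms, pct) <= limit_ms
```

With 1000 requests, a single slow outlier still meets a 99.9% / 300 ms target, but two outliers do not. That is why a percentile target is much stricter than a target on the mean or median.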

Design Considerations

  • Weak consistency (eventually consistent)
  • Application resolves conflicts (always writable)
  • Incremental scalability
  • Symmetry
  • Decentralization
  • Heterogeneity

System Interface:

  • get(key): returns a single object, or a list of objects with conflicting versions, along with a context
  • put(key, context, object)
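
A toy, single-process sketch of how this interface behaves from the application's point of view. The class `ToyStore` is hypothetical, and the context here is just the set of version IDs the caller has seen; the Dynamo paper describes vector clocks for this purpose, which the sketch does not implement.

```python
import itertools


class ToyStore:
    """In-memory stand-in for the get/put interface (not a real Dynamo client)."""

    def __init__(self):
        self._versions = {}            # key -> {version_id: object}
        self._ids = itertools.count(1)

    def get(self, key):
        """Return (list of current versions, context describing what was seen)."""
        versions = self._versions.get(key, {})
        return list(versions.values()), frozenset(versions)

    def put(self, key, context, obj):
        """Store obj; it supersedes only the versions named in context."""
        versions = self._versions.setdefault(key, {})
        for seen in context:
            versions.pop(seen, None)
        versions[next(self._ids)] = obj


store = ToyStore()
store.put("cart", frozenset(), ["book"])
_, ctx = store.get("cart")
store.put("cart", ctx, ["book", "pen"])   # client A writes
store.put("cart", ctx, ["book", "hat"])   # client B writes with the same stale context
conflicts, ctx2 = store.get("cart")       # both versions come back
merged = sorted({item for version in conflicts for item in version})
store.put("cart", ctx2, merged)           # the application resolves the conflict
```

The store never rejects a write. It hands conflicting versions back on the next read and leaves the merge to the application, which matches the "always writable" design consideration.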

Experiences and lessons learned

  • The main advantage of Dynamo is that its client applications can tune the values of N, R and W to achieve their desired levels of performance, availability and durability.
  • Each node keeps an object buffer in main memory. Each write operation is stored in the buffer and is periodically written to storage by a writer thread.
  • 99.94% of requests saw exactly one version; 0.00057% of requests saw 2 versions; 0.00047% of requests saw 3 versions and 0.00009% of requests saw 4 versions (amazing)
  • Client-driven coordination is better than server-driven coordination.
  • Balancing background vs. foreground tasks.
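
A small sketch of what tuning N, R and W means, assuming N is the number of replicas of a key, R the number of replicas that must answer a read, and W the number that must acknowledge a write. If R + W > N, every read set shares at least one replica with every write set. The brute-force check is only there to confirm the arithmetic; the function names are my own.

```python
from itertools import combinations


def quorums_overlap(n, r, w):
    """True when any r replicas and any w replicas out of n must share a node."""
    if not (1 <= r <= n and 1 <= w <= n):
        raise ValueError("need 1 <= r <= n and 1 <= w <= n")
    return r + w > n


def overlap_by_enumeration(n, r, w):
    """Same answer, found by trying every possible read set and write set."""
    nodes = range(n)
    return all(
        set(read_set) & set(write_set)
        for read_set in combinations(nodes, r)
        for write_set in combinations(nodes, w)
    )
```

For example, `quorums_overlap(3, 2, 2)` is true, while `quorums_overlap(3, 1, 1)` is false: the second setting gives faster reads and writes, but a read may miss the latest write.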

The Free Lunch Is Over: A Fundamental Turn Toward Concurrency in Software

http://www.gotw.ca/publications/concurrency-ddj.htm

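Two issues must be considered in parallel programming: the data being processed and the communication between tasks. After user choice and market selection, parallel programming standards have largely converged on the following three models:

1. Data parallelism. The data handled by each task is separate, and tasks communicate by message passing; the data partitioning and message passing are done by the compiler.

HPF (High Performance Fortran) is the typical data-parallel programming language. Because current compiler technology still handles the irregular problems of real applications poorly, and because HPF focuses only on data parallelism, it has not been widely adopted.

2. Message passing. The data handled by each task is separate, and tasks communicate by message passing; the data partitioning and message passing are done by the programmer and the user, which places high demands on the programmer. This model suits message-passing architectures (such as clusters) very well; users and programmers mainly have to think about communication synchronization and communication performance.

PVM (Parallel Virtual Machine) and MPI (Message Passing Interface) are two widely used message-passing standards. PVM emphasizes portability and interoperability in heterogeneous environments; MPI puts more emphasis on performance, but has different implementations in heterogeneous environments. Almost all high-performance computing systems support PVM and MPI.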
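
A minimal sketch of the message-passing style in Python. Threads and `queue.Queue` stand in for separate processes or cluster nodes here, which is an assumption made to keep the example short; this is not PVM or MPI. Each worker only touches the data it receives in a message.

```python
import queue
import threading


def worker(inbox, outbox):
    """Receive chunks as messages, send back one partial sum per chunk."""
    while True:
        chunk = inbox.get()
        if chunk is None:          # sentinel message: no more work
            break
        outbox.put(sum(chunk))


def parallel_sum(chunks, workers=2):
    chunks = [list(chunk) for chunk in chunks]   # each message is its own copy
    inbox, outbox = queue.Queue(), queue.Queue()
    threads = [threading.Thread(target=worker, args=(inbox, outbox))
               for _ in range(workers)]
    for thread in threads:
        thread.start()
    for chunk in chunks:
        inbox.put(chunk)
    for _ in threads:
        inbox.put(None)            # one sentinel per worker
    for thread in threads:
        thread.join()
    # Exactly one result message was sent per chunk.
    return sum(outbox.get() for _ in chunks)
```

The programmer has to partition the data and design the protocol (here, the `None` sentinel) by hand, which is the extra burden described above.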

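3. Shared memory. Tasks share the data they process in memory and also communicate through that shared data; the sharing can be arranged by the programmer or by the compiler. Shared-memory parallel programming is used mainly on symmetric multiprocessor (SMP) systems.

OpenMP (Open Multi-Processing, which grew out of X3H5) and Pthreads (POSIX Threads) are both implementations of shared-memory parallel programming.

OpenMP grew out of the X3H5 standard established in 1993 and is now the de facto industry standard for shared-memory parallel programming, widely supported by vendors such as DEC, Intel, IBM, and Sun. It is implemented for Fortran and C/C++ and mainly supports implicit parallel programming, where the compiler carries out the parallelization.

Pthreads is used mainly on Unix systems. There are many Unix implementations, such as Linux, FreeBSD, Solaris, and Mac OS X, and developing cross-platform multithreaded applications across all these Unix-like systems is far from easy, which is why the POSIX Threads standard was defined. "Programming with POSIX Threads" by David R. Butenhof (who took part in defining the POSIX threads standard) is arguably the essential reference for writing multithreaded applications on Unix, and it is also a valuable reference for parallel development on other platforms.

Overall, shared-memory parallel programming is closest to the way most of today's multithreaded programmers already think, so it is the cheapest route from single-core to multicore systems. Experts still disagree, though: Herb Sutter, for example, is not optimistic about OpenMP, because shared-memory parallel programming has not fundamentally improved much and still relies on locking data resources, which causes performance problems. Message passing has a performance advantage, but it asks too much of the programmer. All of these problems still need continued work from the experts who study parallelism and its standards and libraries.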
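
A minimal sketch of the shared-memory style and its reliance on locking, using Python's `threading` module as a stand-in for Pthreads (an assumption for brevity; the function name is made up). All workers update the same variable, so every update has to take the lock.

```python
import threading


def count_shared(workers=4, increments=10000):
    """Several threads increment one shared counter under a lock."""
    total = 0
    lock = threading.Lock()

    def work():
        nonlocal total
        for _ in range(increments):
            with lock:             # serializes every update to the shared data
                total += 1

    threads = [threading.Thread(target=work) for _ in range(workers)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return total
```

The code is short and familiar, but every increment waits for the lock, so adding threads adds contention rather than speed. That is the performance concern raised above.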
