December 20, 2007 at 3:59 am
· Filed under Research, System
Data center applications are becoming more and more important to companies such as Microsoft, Google, and Amazon. In this scenario, do we still need the TCP/IP network stack? TCP/IP targets complex environments: it has to deal with routing and failure handling. Much of that is unnecessary inside a data center, so maybe it is time for us to redesign the network stack for the data center.
Here are some of my initial ideas about this topic.
We partition the machines in the data center into several groups. Within each group, the machines are all connected. We don’t need to maintain connections or a resend (retransmission) mechanism.
December 19, 2007 at 8:10 am
· Filed under Research
System Assumptions and Requirements
Query Model: simple read and write operations to a data item that is uniquely identified by a key.
ACID Properties: Dynamo targets applications that operate with weaker consistency if this results in high availability. Dynamo does not provide any isolation guarantees and permits only single-key updates.
Efficiency: The system needs to function on a commodity hardware infrastructure. Services have stringent latency requirements, which are in general measured at the 99.9th percentile of the distribution (for example, a service provides a response within 300ms for 99.9% of its requests for a peak client load of 500 requests per second).
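To make the 99.9th percentile requirement concrete, here is a minimal sketch of how such a check could be computed from measured latencies. The function names `percentile` and `meets_sla` are my own, the nearest-rank percentile definition is an assumption (the paper does not say which definition is used), and the sample latencies are made up.

```python
import math


def percentile(samples, p):
    """Nearest-rank percentile: the smallest sample such that at least
    p percent of all samples are less than or equal to it."""
    if not samples:
        raise ValueError("need at least one sample")
    ordered = sorted(samples)
    # The tiny tolerance guards against floating-point noise in p * n / 100.
    rank = math.ceil(p * len(ordered) / 100 - 1e-9)
    return ordered[max(rank, 1) - 1]


def meets_sla(latencies_ms, p=99.9, bound_ms=300):
    """True if the p-th percentile latency is within the bound."""
    return percentile(latencies_ms, p) <= bound_ms


# Made-up measurements: mostly fast requests with a few slow outliers.
mostly_fast = [10] * 9995 + [900] * 5   # 0.05% slow requests
too_many_slow = [10] * 995 + [900] * 5  # 0.5% slow requests
print(meets_sla(mostly_fast), meets_sla(too_many_slow))
```

The point of measuring at the 99.9th percentile instead of the mean is visible here: both sample sets have a low average latency, but only the first one keeps the slow tail within the bound.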
Design Considerations
Weak consistency (eventually consistent)
Application resolves conflicts (always writable)
Incremental scalability
Symmetry
Decentralization
Heterogeneity
System Interface:
get(key): returns a single object or a list of objects with conflicting versions, along with a context
put(key, context, object)
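The interface above can be sketched with a toy, single-node, in-memory store. This is only an illustration under my own assumptions: `ToyStore` is a hypothetical name, and the context here is a tuple of integer version numbers, whereas the real Dynamo context carries vector clock metadata and the data is replicated across nodes.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Tuple


@dataclass
class ToyStore:
    """Hypothetical single-node stand-in for the get/put interface."""
    # key -> list of (version, object) siblings
    data: Dict[str, List[Tuple[int, Any]]] = field(default_factory=dict)
    counter: int = 0  # simplification: a plain counter, not a vector clock

    def get(self, key):
        """Return all current versions of the object plus an opaque context."""
        siblings = self.data.get(key, [])
        objects = [obj for _, obj in siblings]
        context = tuple(version for version, _ in siblings)
        return objects, context

    def put(self, key, context, obj):
        """Write obj; it supersedes only the versions named in context."""
        siblings = self.data.get(key, [])
        # Versions the writer has not seen are kept as conflicting siblings.
        unseen = [(v, o) for v, o in siblings if v not in context]
        self.counter += 1
        self.data[key] = unseen + [(self.counter, obj)]


store = ToyStore()
store.put("cart", (), ["book"])
store.put("cart", (), ["pen"])           # written without seeing "book"
versions, ctx = store.get("cart")        # two conflicting versions
merged = sorted(set(sum(versions, [])))  # the application resolves the conflict
store.put("cart", ctx, merged)
print(store.get("cart")[0])
```

The second put does not overwrite the first because its context is empty, so the reader sees both versions and has to merge them, which is the "application resolves conflicts" design point.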
Experiences and lessons learned
The main advantage of Dynamo is that its client applications can tune the values of N, R and W to achieve their desired levels of performance, availability and durability.
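A small sketch of what tuning N, R and W means, where N is the number of replicas, R the number of replicas that must answer a read, and W the number that must acknowledge a write. The helper name `describe_quorum` and the returned field names are my own; the rule that R + W > N gives a quorum-like system is from the paper.

```python
def describe_quorum(n, r, w):
    """Summarize the trade-offs of a given (N, R, W) configuration."""
    if not (1 <= r <= n and 1 <= w <= n):
        raise ValueError("R and W must be between 1 and N")
    return {
        # With R + W > N, every read set intersects every write set.
        "read_write_overlap": r + w > n,
        # Replicas that may be unavailable while a write still succeeds.
        "write_failures_tolerated": n - w,
        # Replicas that may be unavailable while a read still succeeds.
        "read_failures_tolerated": n - r,
    }


print(describe_quorum(3, 2, 2))  # the common configuration in the paper
print(describe_quorum(3, 1, 1))  # faster, but reads may miss the latest write
```

Lowering W makes writes faster and more available at the cost of durability, and lowering R does the same for reads at the cost of consistency.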
An object buffer is kept in each node's main memory. Each write operation is stored in the buffer and is periodically written to storage by a writer thread.
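The buffered-write idea can be sketched as follows. `BufferedStore` is a hypothetical class, a dictionary stands in for the on-disk storage engine, and the flush interval is an arbitrary value chosen for illustration.

```python
import threading


class BufferedStore:
    """Writes go to an in-memory buffer; a writer thread flushes them."""

    def __init__(self, flush_interval=0.05):
        self._buffer = {}
        self._storage = {}  # stand-in for the durable storage engine
        self._lock = threading.Lock()
        self._stop = threading.Event()
        self._interval = flush_interval
        self._writer = threading.Thread(target=self._run, daemon=True)
        self._writer.start()

    def put(self, key, value):
        # Fast path: only touches memory.
        with self._lock:
            self._buffer[key] = value

    def get(self, key):
        # Check the buffer first so reads see writes not yet flushed.
        with self._lock:
            if key in self._buffer:
                return self._buffer[key]
            return self._storage.get(key)

    def flush(self):
        with self._lock:
            self._storage.update(self._buffer)
            self._buffer.clear()

    def _run(self):
        # Event.wait returns False on timeout, True once stop is set.
        while not self._stop.wait(self._interval):
            self.flush()

    def close(self):
        self._stop.set()
        self._writer.join()
        self.flush()  # write out whatever is still buffered


store = BufferedStore()
store.put("k", "v")
print(store.get("k"))
store.close()
```

The trade-off is durability for performance: anything still in the buffer is lost if the process crashes before the writer thread runs.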
99.94% of requests saw exactly one version; 0.00057% of requests saw 2 versions; 0.00047% of requests saw 3 versions and 0.00009% of requests saw 4 versions (amazing).
Client-driven coordination is better than server-driven coordination.
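PThread is mainly used on Unix systems. There are many Unix implementations, such as Linux, FreeBSD, Solaris, and Mac OS X. Developing cross-platform multithreaded applications across so many “Unix-like” systems is far from easy, which is why the POSIX Thread standard was created. “Programming with POSIX Threads” by David R. Butenhof (one of the initiators of the Boost library and a member of the ISO C++ standards committee) can be called an essential reference for writing multithreaded applications on Unix. It is also a very valuable reference for parallel program development on other platforms.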
The continuation design is good; it avoids a lot of unused branches.
For critical pieces of code: choose instructions that dual-issue well, use a fixed word structure so that prefetching works, and avoid branch mispredictions.
Branch mispredictions cost a lot of performance, but the question is how to reduce them on x86. I am very familiar with optimizing applications on RISC, but not with the x86 micro-architecture. Right now I use the VTune tools to get the performance results, but I don’t know how to reduce the branch mispredictions. If you have some ideas, please tell me. Thanks a lot.
The interface for index stream readers (ISRs): loc(), next(), seek(X)
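Here is a minimal sketch of an ISR over an in-memory sorted list, and of how two ISRs could be combined to answer an AND query. I am assuming that loc() returns the current location, next() advances to the following location, and seek(X) advances to the first location at or after X. The class `ListISR`, the `END` sentinel and the `intersect` helper are my own names, not the real implementation.

```python
import bisect


class ListISR:
    """Index stream reader over a sorted list of locations."""
    END = float("inf")  # sentinel returned once the stream is exhausted

    def __init__(self, locations):
        self._locs = sorted(locations)
        self._i = 0

    def loc(self):
        """Current location, or END if the stream is exhausted."""
        return self._locs[self._i] if self._i < len(self._locs) else self.END

    def next(self):
        """Advance by one location and return the new current location."""
        if self._i < len(self._locs):
            self._i += 1
        return self.loc()

    def seek(self, x):
        """Advance to the first location >= x; never moves backwards."""
        self._i = bisect.bisect_left(self._locs, x, lo=self._i)
        return self.loc()


def intersect(a, b):
    """Locations present in both streams (a simple AND of two ISRs)."""
    out = []
    while a.loc() != ListISR.END and b.loc() != ListISR.END:
        if a.loc() == b.loc():
            out.append(a.loc())
            a.next()
            b.next()
        elif a.loc() < b.loc():
            a.seek(b.loc())  # skip everything that cannot match
        else:
            b.seek(a.loc())
    return out


print(intersect(ListISR([1, 3, 5, 9]), ListISR([3, 4, 9, 12])))  # [3, 9]
```

seek(X) is what makes this cheap: the stream that is behind jumps directly to the other stream's location instead of stepping through every entry with next().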
Constraint solving is very critical to index serving.
Queries take about 100 cycles/query/MByte (AltaVista), with a 1.5G index size.
Time breakdown: 30% inner loop, 15% constraint solver, 15% higher-level seek code, 7% ranking code, 0.2% merging results. Miss ratios: 2% I-cache, 8% D-cache, 8% level-2 cache, 40% level-3 cache. It seems that the current ranking algorithm is more complex than the one in 2000.
I found an old paper, “Scale in Distributed Systems”, published in 1994, and it is still quite useful. It summarizes the problems we will encounter when we want to deploy a system at very large scale. The paper gives a set of principles for scalable systems, along with a list of questions to ask when considering how far a system scales. If you need to design a distributed system soon, it is probably worth reading.
December 7, 2007 at 7:19 am
· Filed under Research, System
Today we discussed an SOSP paper, “Protection and Communication Abstractions for Web Browsers in MashupOS”. As we know, mashups are very popular now. Maybe they will be the killer of the traditional operating system: in the future we may be able to do everything in the browser, with the data kept on the Internet by big service companies like Google, Facebook, Yahoo, Live, and MSN. The paper gives communication abstractions for browsers and shows how to protect them. Current browsers do not support communication between different domains, but mashups need this communication, so future browsers may support this functionality, and at that point the browser will be like a new operating system. How do we avoid the problems we encounter now (memory leaks, buffer overflows)? The paper wants to solve these problems in MashupOS.
November 23, 2007 at 7:27 am
· Filed under Research
Current network simulators are too detailed (packet level). From this talk, it seems that their accuracy is quite fragile, so maybe we should just use real machines and switches to build a simple test bed. Is that enough when we want to build geo-distributed applications at the beginning stage?
Recently I have been thinking about how to build a geo-distributed service. As you know, a geo-distributed test bed is hard to build. We have PlanetLab, but it is a shared cluster: you cannot use it exclusively, which is different from a real deployment.
So I think we need to build a simulation of the geo-distributed environment so that we can refine our raw design on it before we deploy it in the real world. This simulation platform should be transparent to the application, and it should have a topology layer so that users can define their own topologies.
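To illustrate what I mean by a topology layer, here is a rough sketch, not a real simulator. The class `Topology`, its methods, the site names and the latency numbers are all hypothetical. The user describes sites and link latencies, and the platform computes the delay that a message between two sites would experience.

```python
import heapq


class Topology:
    """User-defined sites and links with one-way latencies in milliseconds."""

    def __init__(self):
        self._links = {}

    def add_link(self, a, b, latency_ms):
        # Links are treated as symmetric in this sketch.
        self._links.setdefault(a, {})[b] = latency_ms
        self._links.setdefault(b, {})[a] = latency_ms

    def delay_ms(self, src, dst):
        """Lowest total latency from src to dst (Dijkstra's algorithm)."""
        best = {src: 0}
        heap = [(0, src)]
        while heap:
            dist, node = heapq.heappop(heap)
            if node == dst:
                return dist
            if dist > best.get(node, float("inf")):
                continue  # stale heap entry
            for neighbor, latency in self._links.get(node, {}).items():
                candidate = dist + latency
                if candidate < best.get(neighbor, float("inf")):
                    best[neighbor] = candidate
                    heapq.heappush(heap, (candidate, neighbor))
        raise ValueError(f"no path from {src} to {dst}")


# Made-up topology: three sites with invented latencies.
topo = Topology()
topo.add_link("seattle", "dublin", 140)
topo.add_link("seattle", "beijing", 150)
topo.add_link("dublin", "beijing", 400)
print(topo.delay_ms("dublin", "beijing"))  # 290, via seattle
```

A simulation platform could intercept the application's sends and delay each message by `delay_ms(src, dst)`, which would keep the topology transparent to the application.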
I know there are dozens of similar simulators, but which one fits our requirements? ns2? If you are an expert on network simulators, please give me some advice. Thanks a lot.